History log of /dalvik/vm/mterp/out/InterpAsm-armv7-a-neon.S
Revision	Date	Author	Comments
5dfcc78af479937ba8dafceefd9b1931a88dfaaf	11-Aug-2012	Ard Biesheuvel <ard.biesheuvel@gmail.com>	hardening: eliminate all text relocations from libdvm This patch consists of: - changes to mterp/ that turn all literals from absolute to PC relative, so the relocations can be resolved at (build) link time - changes to compiler/template/ that make the compiler templates live in the non-executable .data.rel.ro section (this code is never executed directly, only from the jit heap, so there is no reason to put it in the .text section) Change-Id: I2dc97bd4720b393a74b7277a188f0c7b681fc932 Signed-off-by: Ard Biesheuvel <ard.biesheuvel@gmail.com>
8b095215a4d5bde723819087f3455bdcc250a78f	20-Jun-2012	David Butcher <david.butcher@arm.com>	Switched code to blx <reg>. The sequence "ldr ip,<addr>; blx ip" is preferred over "mov lr,pc; ldr pc,<addr>" from armv5te, and will typically perform better on later ARM processors. Change-Id: I8f2e5e794c644faafd767037ad56579f2934de47
ab35b50311951feea3782151dd5422ee944685c2	05-Jan-2012	Elliott Hughes <enh@google.com>	Remove unsupported experimental opcodes. External developers were starting to try to get themselves into trouble with this stuff... Change-Id: I2b03bfeaa8c98b6a994bc7924fc8dcf4e4d4f6cb
4185972e211b0c84b9fe7d90c56b28cc15e474fa	27-Sep-2011	buzbee <buzbee@google.com>	Fix memory barriers (Issue 3338450) Add extra memory barrier on volatile stores. Change-Id: Id4a4750cdfc910eda2f0b44ead0af2a569b5735e
291c84f60853d30e1c0d79dd08c5e5164f588e26	26-May-2011	Dan Bornstein <danfuzz@android.com>	Prefer printf format "%#x" over "0x%x". I exist to serve. Change-Id: I8e2880b20eefd466da8515d5b6b0c5cb75d56169
97b22b8d41742fa84812f46d1125e9735420782a	23-May-2011	buzbee <buzbee@google.com>	Fix alt handling for Jumbo ops The mterp alt mechanism allows us to break out of the fast interpreter loop to handle debugging, JIT trace selection and other non-standard behavior. It does this by replacing the base pointer to the instruction handlers with an alternate base that references a set of dummy handlers that first call out to dvmCheckBefore() to handle special actions before routing control to the real handlers. This mechanism was slightly broken for the Jumbo ops - which have a first opcode byte of 0xFF (Dispatch), followed by the Jumbo opcode byte. In short, when the altHandlerBase was active dvmCheckBefore() was getting called for both the dispatch opcode byte and the Jumbo byte. This change adds special ALT_OP_DISPATCH_FF handlers which skip the dvmCheckBefore() call. Change-Id: If57c298a33404cdaca7456bc8fe1159c70240bea
00ceb87d1c57ccee59966be4deef1292a049285c	03-May-2011	buzbee <buzbee@google.com>	Fix for double breakpoint (issue 4378296) Breakpoints are given special handling in the interpreter. They are first interpreted as a breakpoint (with notification to the debugger), and then the actual instruction associated with the breakpoint location is interpreted. The bug here was that the "dvmCheckBefore()" handler was invoked prior to both "interpretations" - rather than just the first. Note that this defect appears only in the Arm mterp, the portable interpreter and x86 mterp did the right thing. Change-Id: Ied957edc0c248b5d4d94910beb7af6c03ffe885d
cf2aac7e6a29e7e1e5f622fd6123e0d1a9a75bda	25-Apr-2011	buzbee <buzbee@google.com>	Refine & simplify the interBreak mechanism Replace dvmUpdateInterpBreak() and friends with more direct enable/disable subMode calls. Hide breakFlags manipulation from higher-level callers and infer what is needed from the active subMode. Add documentation to the interpreter control section of mterp/README.txt Change-Id: If7ebee5d8e4db8154c4caed72cf89ec088045998
389e258a5b9b2afb7bfaee3344c615d3310fae4e	23-Apr-2011	buzbee <buzbee@google.com>	InterpBreak cleanup (part 1) Moved the suspend count variables from the interpBreak structure. These are already protected by a mutex, and we need the space in interpBreak for additional subMode flags. This CL just does the move and expands the width of subMode to 16-bits. Change-Id: I4a6070b1ba4fb08a0f6e0aba6f150b30f9159eed
30bc0d46ae730d78c42c39cfa56a59ba3025380b	22-Apr-2011	buzbee <buzbee@google.com>	Consolidate curFrame fields in thread storage We ended up with two locations in the Thread structure for saved Dalvik frame pointer. This change consolidates them. Change-Id: I78f288e4e57e232f29663be930101e775bfe370f
99e3e6e72e3471eb85fc2e405866392b01c080fe	29-Mar-2011	buzbee <buzbee@google.com>	Fix interpreter debug attach Fix a few miscellaneous bugs from the interpreter restructuring that were causing a segfault on debugger attach. Added a sanity checking routine for debugging. Fixed a problem in which the JIT's threshold and on/off switch wouldn't get initialized properly on thread creation. Renamed dvmCompilerStateRefresh() to dvmCompilerUpdateGlobalState() to better reflect its function. Change-Id: I5b8af1ce2175e3c6f53cda19dd8e052a5f355587
9a3147c7412f4794434b4c2604aa2ba784867774	03-Mar-2011	buzbee <buzbee@google.com>	Interpreter restructuring This is a restructuring of the Dalvik ARM and x86 interpreters: o Combine the old portstd and portdbg interpreters into a single portable interpreter. o Add debug/profiling support to the fast (mterp) interpreters. o Delete old mechanism of switching between interpreters. Now, once you choose an interpreter at startup, you stick with it. o Allow JIT to co-exist with profiling & debugging (necessary for first-class support of debugging with the JIT active). o Adds single-step capability to the fast assembly interpreters without slowing them down (and, in fact, measurably improves their performance). o Remove old "polling for safe point" mechanism. Breakouts now achieved via modifying base of interpreter handler table. o Simplify interpreter control mechanism. o Allow thread-granularity control for profiling & debugging The primary motivation behind this change was to improve the responsiveness of debugging and profiling and to make it easier to add new debugging and profiling capabilities in the future. Instead of always bailing out to the slow debug portable interpreter, we can now stay in the fast interpreter. A nice side effect of the change is that the fast interpreters got a healthy speed boost because we were able to replace the polling safepoint check that involved a dozen or so instructions with a single table-base reload. When combined with the two earlier CLs related to this restructuring, we show a 5.6% performance improvement using libdvm_interp.so on the Checkers benchmark relative to Honeycomb. Change-Id: I8d37e866b3618def4e582fc73f1cf69ffe428f3c
8cd640b8327e2591c8dd8a69093fa1fc6c901c05	23-Mar-2011	Andy McFadden <fadden@android.com>	Fix some exception issues The function that obtained an exception's message string was making a bad assumption. This has been corrected. Also, in the invoke-object-init functions, we now test for a pending exception after the call to dvmSetFinalizable(). Also, make the test for pending exception at the end of VM startup an error rather than an assert. Bug 4121213 Change-Id: I6912e5c79d63e8dda1a1dc2e788c7a8edcf487aa
3475f9cdb47a6d6f8ad2ce49bbc3af46bca92f09	21-Mar-2011	Carl Shapiro <cshapiro@google.com>	Move finalization out of the VM. This change introduces a new reference class whose referent points to instances requiring finalization. This makes the finalization of objects possible using a reference queue and a dedicated thread which removes objects from the queue. Change-Id: I0ff6dd272f00ca08c6ed3aa667bf766a039a944e
61f4c7e40b885ccb0a55d9553f07a888469621dc	16-Mar-2011	Dan Bornstein <danfuzz@android.com>	Clean up ArrayStoreException some more. Each of the four variants thrown by the VM now has a descriptively-named function defined in Exception.c, and the messages uniformly use human-oriented class names instead of the internal "[[Lfoo/bar/Baz;" forms. Bug: 3500987 Change-Id: I747315e36005c6d352116ce6a8af9d49c622f59a
24bd4c50bb3ea13be4f049710967961f0546fb2c	10-Mar-2011	Andy McFadden <fadden@android.com>	Add volatile/jumbo opcodes This adds 12 dexopt-generated "volatile/jumbo" instructions, to be used for replacing appropriate get/put ops, plus a jumbo replacement for invoke-object-init/range. The new instructions are defined but not yet used. For x86 and x86-atom, C stubs are selected. Also, guarded macro args used in arithmetic expressions in header.S. Bug 3403118 Change-Id: I283cea053d1cee1d70c3715df3e71177e8b8d3b2
2ff04ab635eeba79c2dad82850c34188abcdfe62	09-Mar-2011	Andy McFadden <fadden@android.com>	Fix method profiling Moved a couple of things out of MethodTraceState so they don't get zeroed after being initialized. Also, rearranged the native method invocation path slightly so the common case runs uninterrupted. Change-Id: I0dad007a7f344d93f30444156e67f20bed6606a4
47f58250c5177adba475b0b11a36151ac0ce9ab9	07-Mar-2011	Dan Bornstein <danfuzz@android.com>	Consistency in exception throws. Make the messages that consist of a series of values consistently use semicolons between the values, and make the call order for exception throws that take both "info about a thing" as well as "info about a use of that thing" take the "info about a thing" argument first. Practical upshot: Adding a second semicolon in the message for StringIndexOutOfBoundsException being thrown for a region, and switching the order of arguments of dvmThrowArrayIndexOutOfBoundsException(). Bug: 3500987 Change-Id: I97eb0046ab8997a68e2d6dfde5dbf3d02290c1f7
0346e9dcddccd449c731e42ef83708ff6d8f0976	02-Mar-2011	Andy McFadden <fadden@android.com>	Change invoke-object-init to /range form The invoke-object-init instruction pretends to be a regular invoke that only knows how to call Object.<init>. As such it always takes one argument, and if we use the /range version we can specify the "this" register with 16 bits instead of only 4. Bug 3486699 Change-Id: I9ee4700c6935beee1dcbaa583b57befd33641414
3d054be0780e2bee9553711d409608495cc2c19e	02-Mar-2011	buzbee <buzbee@google.com>	mterp generation cleanup Change I3a22048a introduced a new interpreter breakout mechanism, and with it a bit of hackish ugliness in the mechanism to automatically generate interpreter source files. This CL applies some Lipo and Botox: o New alt-op-start, alt-op-end commands removed - will just use existing op-start & op-end. o New command "handler-style" to explicitly declare interpreter style (computed-goto, jump-table or all-c). Previous trigger on "handler-size==0" removed. o Alternate handler stub no longer using fixed file name, but instead is named by command asm-alt-stub (which is modelled on existing alt-stub command). o Previous CL stated requirement for explicitly called-out handler for the Dalvik dispatch opcode. Turns out this was not necessary. Requirement removed. Change-Id: I20f7411820715476533c2073d28f357e28c1ae52
98f3eb12bf2a33c49712e093d5cc2aa713a93aa5	01-Mar-2011	buzbee <buzbee@google.com>	Exception cleanup in the assembly interpreters Removed the last of the "exception as strings" calls from the assembly interpreters, replacing them with the helper functions. Change-Id: I4c44cde348ed7d2ea99f908bc22166afeb5e3d37
a7d59bbafea5430fe81fc21ba94ddf6f6a63b0b3	24-Feb-2011	buzbee <buzbee@google.com>	New interpreter breakout mechanism Introduce parallel handler entry points for mterp interpreters as a step towards fully supporting debug, profile and JIT within mterp (instead of bailing out to the portable debug interpreter). This CL contains most of the structural changes that need to happen, but does not yet enable the new switch mode. In short, within the mterp assembly interpreter register rIBASE points to an array of handlers for Dalvik opcodes. Instead of periodically checking for suspend, debug, profiling and JIT trace selection breakouts, rIBASE may simply be altered to point to the parallel breakout handlers when control needs to be rerouted. This will enable us to eliminate the separate portable debug interpreter and the entire mechanism of switching between the fast and portable interpreters. The x86 implementation required a large number of changes because of the need to dedicate a register to holding the table base. It will now use %edx (which was previously scratch). Changes include: o Support for two styles of mterp assembly code generation: computed goto and jump table (ARM uses computed goto, x86 uses jump table) o New mterp config operators to trigger generation of alternate entry points. o Alternate entries route execution through new dvmCheckInst(). That's where the checking code will go. o For x86, reserved register edx as dedicated rIBASE. o For jump-table mterps, ignore "%break" operator and allow variable-sized handlers with no "sister" region. Note that the x86-atom implementation will need substantial changes to function in this new model. Change-Id: I3a22048adb7dcfdeba4f94fbb977b26c3ab2fcb3
8cb0d098d79af61546e275f633325794f4587602	28-Feb-2011	buzbee <buzbee@google.com>	Use new negative array size exception reporting Follow-up to change 98624 to enhance assembly interpreters to use the new dvmThrowNegativeArraySizeException. Change-Id: I9c8b425b3255d42afa1dc466024c03eeeb4eec23
74501e600dcb5634aa26aee0a3f57f2b45b213f2	24-Feb-2011	Dan Bornstein <danfuzz@android.com>	Round three of exception cleanup. I expanded AIOOBE since it was the odd one out, migrated the wrappers in Exception.h to the end of the file where they're less disruptive, and tweaked a couple other throws in the main vm code. Change-Id: Iae11fda2c47989ce7579483df226124ffeb2ac84
9f601a917c8878204482c37aec7005054b6776fa	12-Feb-2011	buzbee <buzbee@google.com>	Interpreter restructuring: eliminate InterpState The key datastructure for the interpreter is InterpState. This change eliminates it, merging its data with the Thread structure. Here's why: In principio creavit Fadden Thread et InterpState. And it was good. Thread holds thread-private state, while InterpState captures data associated with a Dalvik interpreter activation. Because JNI calls can result in nested interpreter invocations, we can have more than one InterpState for each actual thread. InterpState was relatively small, and it all worked well. It was used enough that in the Arm version a register (rGLUE) was dedicated to it. Then, along came the JIT guys, who saw InterpState as a convenient place to dump all sorts of useful data that they wanted quick access to through that dedicated register. InterpState grew and grew. In terms of space, this wasn't a big problem - but it did mean that the initialization cost of each interpreter activation grew as well. For applications that do a lot of callbacks from native code into Dalvik, this is measurable. It's also mostly useless cost because much of the JIT-related InterpState initialization was setting up useful constants - things that don't need to be saved and restored all the time. The biggest problem, though, deals with thread control. When something interesting is happening that needs all threads to be stopped (such as GC and debugger attach), we have access to all of the Thread structures, but we don't have access to all of the InterpState structures (which may be buried/nested on the native stack). As a result, polling for thread suspension is done via a one-indirection pointer chase. InterpState itself can't hold the stop bits because we can't always find it, so instead it holds a pointer to the global or thread-specific stop control. Yuck. With this change, we eliminate InterpState and merge all needed data into Thread. Further, we replace the dedicated rGLUE register with a pointer to the Thread structure (rSELF). The small subset of state data that needs to be saved and restored across nested interpreter activations is collected into a record that is saved to the interpreter frame, and restored on exit. Further, these small records are linked together to allow tracebacks to show nested activations. Old InterpState variables that simply contain useful constants are initialized once at thread creation time. This CL is large enough by itself that the new ability to streamline suspend checks is not done here - that will happen in a future CL. Here we just focus on consolidation. Change-Id: Ide6b2fb85716fea454ac113f5611263a96687356
6af2ddd107842c3737c04c37343cac9be17f4209	17-Feb-2011	Andy McFadden <fadden@android.com>	Defer marking of objects as finalizable This shifts responsibility for marking an object as "finalizable" from object creation to object initialization. We want to make the object finalizable when Object.<init> completes. For performance reasons we skip the call to the Object constructor (which doesn't do anything) and just take the opportunity to check the class flag. Handling of clone()d object isn't quite right yet. Also, fixed a minor glitch in stubdefs. Bug 3342343 Change-Id: I5b7b819079e5862dc9cbd1830bb445a852dc63bf
b387fe1b970a216c09d2abc98c893ff1fff3e512	16-Feb-2011	Andy McFadden <fadden@android.com>	Fix some asm .size directives We were missing a .size directive for dvmPlatformInvoke, and the directive for the mterp handlers wasn't being handled right. Threw in a bonus directive for the entry point and the "assist debugger" stuff that wraps method calls. Bug 3456786 Change-Id: Ideee64a496e54eb09008410e9e9eba652b59f403
d3a92b577f11c6357c76dc850c6cbf352ef4c760	15-Feb-2011	buzbee <buzbee@google.com>	Remove spurious code from bad merge/pilot error Looks like I spuriously duplicated two lines of assembly code in footer.S around a call to method profiling. Removed. Change-Id: I0b10656e15eba0d16af8784ae9ca09c33b5096c0
750d110b62cef538e193b6f91f5239b0c4b63ef1	12-Feb-2011	Andy McFadden <fadden@android.com>	Rename invoke-direct-empty to invoke-object-init The invoke-direct-empty instruction was introduced to remove the overhead of calling the empty Object constructor. We now need it to do some extra work on behalf of object construction, so it's appropriate to change the instruction name to match the role it fills rather than the more general role it was hoped to fill. No functional changes. Bug 3342343 Change-Id: I65dd6a2c00c99581c9a19b16fe193b70642c8fbb
01605d2b668e8e1701cfdfa302dde847b9171fc9	01-Feb-2011	Carl Shapiro <cshapiro@google.com>	Remove the unused monitor tracking and deadlock prediction code. This feature has been in the code base for several releases but has never been enabled. Change-Id: Ia770b03ebc90a3dc7851c0cd8ef301f9762f50db
cfdeca37fcaa27c37bad5077223e4d1e87f1182e	14-Jan-2011	Ben Cheng <bccheng@android.com>	Add runtime support for method based compilation. Enhanced code cache management to accommodate both trace and method compilations. Also implemented a hacky dispatch routine for virtual leaf methods. Microbenchmark showed 3x speedup in leaf method invocation. Change-Id: I79d95b7300ba993667b3aa221c1df9c7b0583521
18fba346582c08d81aa96d9508c0e935bad5f36f	20-Jan-2011	buzbee <buzbee@google.com>	Support traceview-style profiling in all builds This change builds on an earlier bccheng change that allowed JIT'd code to avoid reverting to the debug portable interpreter when doing traceview-style method profiling. That CL introduced a new traceview build (libdvm_traceview) because the performance delta was too great to enable the capability for all builds. In this CL, we remove the libdvm_traceview build and provide full-speed method tracing in all builds. This is done by introducing "_PROF" versions of invoke and return templates used by the JIT. Normally, these templates are not used, and performance is unaffected. However, when method profiling is enabled, all existing translations are purged and new translations are created using the _PROF templates. These templates introduce a smallish performance penalty above and beyond the actual tracing cost, but again are only used when tracing has been enabled. Strictly speaking, there is a slight burden that is placed on invokes and returns in the non-tracing case - on the order of an additional 3 or 4 cycles per invoke/return. Those operations are already heavyweight enough that I was unable to measure the added cost in benchmarks. Change-Id: Ic09baf4249f1e716e136a65458f4e06cea35fc18
cb3081f675109049e63380170b60871e8275f9a8	14-Jan-2011	buzbee <buzbee@google.com>	Consolidate mterp's debug/profile/suspend control This is a step towards full debug & profiling support in JIT'd code. Previously, the interpreter made multiple distinct checks for pending suspend requests, debugger and profiler checks at each safe point. This CL moves the individual controls into a single control word, significantly speeding up the safe-point check code path in the common fast case. In short, any time some VM component wants control to break at a safe point it will set a bit in gDvm.interpBreak, which will be examined at the safe point check in footer.S. In the old code, the safe point check consisted of 11 instructions (including 6 loads). The new sequence is 6 instructions (4 loads - two of which are needed and two are speculative to fill otherwise stalling slots). This code path is hot enough in the interpreter that we actually see some measurable speedups in benchmarks. The old sieve benchmark improves from 252 to 256 (~1.5%). As part of the change, global debuggerActive and activeProfilers variables have been eliminated as redundant. Note also that there is a subtle change in thread suspension. Thread suspend request counts are kept on a per-thread basis, and previously each thread would only examine its own suspend count. With this change, a bit has been allocated in interpBreak to signify that at least one suspend request is active across all threads. This bit is treated as "some thread is supposed to suspend, check to see if it's me". Change-Id: I527dc918f58d1486ef3324136080ef541a775ba8
71eee1f0c2eb514585fdbee16730c9c2209e8f68	04-Jan-2011	jeffhao <jeffhao@google.com>	Added vm support for new jumbo opcodes. This enables jumbo opcodes by default, and they will get used by the current build without modification. Support has been added for arm, x86, and the portable interpreter. x86-atom support is on the TODO list. This commit also includes a test for the new jumbo opcodes. Change-Id: Ic3f1b41b51645861c5196f76aaf0e96e727ea537
90f15431b24a4004fab2db70f273155fcd1c42a4	03-Dec-2010	Dan Bornstein <danfuzz@android.com>	Make opcode 00ff be called "dispatch-ff". With this change, it's still implemented as an unused opcode, but it's now ready for its new life! Change-Id: Ic70d311704925067e47d87b657d133a792144e65
63644657f74e0a5d05f2c5fb56a18872e7ac7427	20-Nov-2010	Elliott Hughes <enh@google.com>	Better ArrayStoreException detail messages. This fixes the portable interpreter, ARM, and x86. System.arraycopy was already doing the right thing. Bug: 3216051 Change-Id: I8a675eb62d6e7fd53a009f53ce8e34f93799b18c
c560e30f68265068bed9eadf174d1e76288d2952	18-Nov-2010	Elliott Hughes <enh@google.com>	Include both types in ClassCastException detail messages. Along the lines of "java.lang.Exception cannot be cast to java.lang.String". This is ARM and portable interpreter only. x86 will come later. Bug: 3210374 Change-Id: I48719dbdb569bbc3be2a31d0e5507b8dc42101b3
8c9ac9ab0ab6fd75b73cb0d99005da3aa90c167c	22-Oct-2010	Ben Cheng <bccheng@android.com>	Avoid conditional loads if WORKAROUND_CORTEX_A9_745320 is defined. No noticeable performance impact by this change. Bug: 3117632 Change-Id: I31c6adc6cb9999498bb456f1e87f6f04f33e4144
d88756df5b4dbc6fd450afd0019a5f64ebe4432d	22-Oct-2010	Elliott Hughes <enh@google.com>	Remove junk from platform.S now armv4t is gone. Change-Id: I30079aacc753c89cfc3a3f64bd900a0cc858d65f
e877cc41c1a5d4f577c5f6fc6bacbe388dfd1d59	21-Oct-2010	Elliott Hughes <enh@google.com>	Detail messages for ArrayIndexOutOfBoundsExceptions. This adds ARM fast interpreter and JIT support. x86 is still missing. Change-Id: Ide46fd9dcd06780193848f594ce7d1491d7f5a96
0016024cdd2bdeef3b98c92f7a8f40a2bc1ff42d	21-Oct-2010	Elliott Hughes <enh@google.com>	Better detail messages in ArrayIndexOutOfBoundExceptions. The RI only includes the index. We've traditionally included nothing. This patch fixes the portable interpreter to include both the index and the array length. Later patches will address the ARM- and x86-specific code. Change-Id: I9d0e6baacced4e1d33e6cd75965017a38571af67
b78c76f88ea42e7a3b295c210ca9ee86e7290043	01-Oct-2010	buzbee <buzbee@google.com>	GC Card marking fix for SPUT_OBJECT - use correct object head Change-Id: I8b84a4f1e1690f5b62de7404ea6ede00317848bb
d82097f6b409c5cd48568e54eb701604c3cceb18	27-Sep-2010	buzbee <buzbee@google.com>	Change GC card marking to use object head, bug fix for volatile sput obj This CL changes the way we mark GC cards to consistently use the object head (previously, we marked somewhere in the object - often the head, but not always). Also, previously a coding error caused us to skip the card mark for OP_APUT_OBJECT_VOLATILES. Fixed here. Change-Id: I133ef6395c51a0466c9708209b08e79c3083aff2
d3b0a4bf6b2e38e6e9e80e203ca753e941084103	27-Sep-2010	buzbee <buzbee@google.com>	Change GC card marking to use object head, bug fix for volatile sput obj This CL changes the way we mark GC cards to consistently use the object head (previously, we marked somewhere in the object - often the head, but not always). Also, previously a coding error caused us to skip the card mark for OP_APUT_OBJECT_VOLATILES. Fixed here. Change-Id: I53eb333b9bd0b770201af0dc617d9a8f38afa699
4934b377d9cf5df6f80da7caab4f2178c6cec307	21-Sep-2010	Ben Cheng <bccheng@android.com>	Several fixes for JIT and self-verification under corner cases. 1) Fix the self-verification mode to handle backward chaining cell properly when a single-step instruction is in the middle of the cyclic portion of the trace. Then found issue 2 when changing the JIT threshold to 1. 2) When the code cache is full, the VM may stop making forward progress and bounce back and forth between the debug and fast interpreters as the translation request is constantly rejected. The fix is to stay in the debug interpreter until the corner case condition is cleared. Then found issue 3. 3) Under self-verification mode, the code cache reset request may get delayed indefinitely due to spurious indication that a thread is running JIT'ed code. Trivial fix - make sure the inJitCodeCache flag is cleared. (cherry-picked from dalvik-dev) Change-Id: Ic0b9952c0ae545f68f7eb2ae06a82a634ab62e9e
1a7b9d7703297358d6b2276dff02eaff6586a6fd	21-Sep-2010	Ben Cheng <bccheng@android.com>	Several fixes for JIT and self-verification under corner cases. 1) Fix the self-verification mode to handle backward chaining cell properly when a single-step instruction is in the middle of the cyclic portion of the trace. Then found issue 2 when changing the JIT threshold to 1. 2) When the code cache is full, the VM may stop making forward progress and bounce back and forth between the debug and fast interpreters as the translation request is constantly rejected. The fix is to stay in the debug interpreter until the corner case condition is cleared. Then found issue 3. 3) Under self-verification mode, the code cache reset request may get delayed indefinitely due to spurious indication that a thread is running JIT'ed code. Trivial fix - make sure the inJitCodeCache flag is cleared. Change-Id: I107eb23102940df07c27c7f2b5cc22e30fbdcd1c
1df319e3674d993a07bc0ff1f56a5915410b5903	15-Sep-2010	Andy McFadden <fadden@android.com>	Use store barrier instead of full barrier. Make use of ANDROID_MEMBAR_STORE when appropriate. In mterp, define a new SMP_DMB_ST macro that will (soon) expand into "dmb st" on ARMv7-A platforms configured for SMP. Bug 3003477. Change-Id: I03c09e93e1374d1c668588c9ad52f5c08d3d2435
291758c5c4902900c6f86794ba8ab9cad9b26197	10-Sep-2010	Andy McFadden <fadden@android.com>	Add return-void-barrier instruction. This introduces the return-void-barrier instruction, which is identical to return-void on UP systems, but provides an additional store/store barrier on SMP. This is intended for use in constructors of objects with final fields. The assembler doesn't like "dmb st", and we don't have an ANDROID_MEMBAR_STORE barrier defined, so this currently uses full fences. This just defines the new instruction. It's not actually used yet. Also, removed some stale "unused" files from the x86 and x86-atom directories. Bug 2965743. Change-Id: I072e372fd2d57f2617a8d4fff5fd4b38bdda75d1
5cc61d70ec727aa22f58463bf7940cc717cf3eb1	31-Aug-2010	Ben Cheng <bccheng@android.com>	Collect method traces with the fast interpreter and the JIT'ed code. Insert inline code instead of switching to the debug interpreter in the hope that the time stamps collected in traceview are closer to the real world behavior with minimal profiling overhead. Because the inline polling still introduces additional overhead (20% ~ 100%), it is only enabled in the special VM build called "libdvm_traceview.so". It won't work on the emulator because it is not implemented to collect the detailed instruction traces. Here are some performance numbers using the FibonacciSlow microbenchmark (ie recursive workloads / the shorter the faster), listed as time: configuration. 8,162,602: profiling off/libdvm.so/JIT off; 2,801,829: profiling off/libdvm.so/JIT on; 9,952,236: profiling off/libdvm_traceview.so/JIT off; 4,465,701: profiling off/libdvm_traceview.so/JIT on; 164,786,585: profiling on/libdvm.so/JIT off; 164,664,634: profiling on/libdvm.so/JIT on; 11,231,707: profiling on/libdvm_traceview.so/JIT off; 8,427,846: profiling on/libdvm_traceview.so/JIT on. Comparing the 8,427,846 vs 164,664,634 numbers against the true baseline performance number of 2,801,829, the new libdvm_traceview.so improves the time skew from 58x to 3x. Change-Id: I48611a3a4ff9c4950059249e5503c26abd6b138e
0d615c3ce5bf97ae65b9347ee77968f38620d5e8	18-Aug-2010	Andy McFadden <fadden@android.com>	Always support debugging and profiling. This eliminates the use of the WITH_DEBUGGER and WITH_PROFILER conditional compilation flags. We've never shipped a device without these features, and it's unlikely we ever will. They're not worth the code clutter they cause. As usual, since I can't test the x86-atom code I left that alone and added an item to the TODO list. Bug 2923442. Change-Id: I335ebd5193bc86f7641513b1b41c0378839be1fe
7a2697d327936e20ef5484f7819e2e4bf91c891f	07-Jun-2010	Ben Cheng <bccheng@android.com>	Implement method inlining for getters/setters Changes include: 1) Force the trace that ends with an invoke instruction to include the next instruction if it is a move-result (because both need to be turned into no-ops if callee is inlined). 2) Interpreter entry point/trace builder changes so that return target won't automatically be considered as trace starting points (to avoid duplicate traces that include the move result instructions). 3) Codegen changes to handle getters/setters invoked from both monomorphic and polymorphic callsites. 4) Extend/fix self-verification to form identical trace regions and handle traces with inlined callees. 5) Apply touchups to the method based parsing - still not in use. Change-Id: I116b934df01bf9ada6d5a25187510e352bccd13c
919eb063ce4542d3698e10e20aba9a2dfbdd0f82	12-Jul-2010	buzbee <buzbee@google.com>	Interpreter & JIT support for write barriers In this iteration, cards are marked on either the store address or the object head (whichever leads to faster code). In all cases, though, card marks are deferred until after the associated store has completed. Change-Id: I633d6e8c3bebdb80bde92efb4fa6fc7cc84f60fc
0890e5bf0b2a502ca1030e9773fabc16ef1b5981	18-Jun-2010	Andy McFadden <fadden@android.com>	Fiddle with SMP_DMB. This changes it from a macro that takes an argument to a simpler macro that is named explicitly by the 8 instructions that want it. Change-Id: Ie17a9722823d590851776b6b9b057eadf22fa6a8
c35a2ef53d0cccd6f924eeba36633220ec67c32e	17-Jun-2010	Andy McFadden <fadden@android.com>	Add opcodes for volatile field accesses This adds instructions for {i,s}{get,put}{,-object}-volatile, for a total of eight new instructions. On SMP systems, these instructions will be substituted in for existing field access instructions, either by dexopt or during just-in-time verification. Unlike the wide-volatile instructions, these will not be used at all when the VM is not built for SMP. (Ideally we'd omit the volatile instruction implementations entirely on non-SMP builds, but that requires a little work in gen-mterp.py.) The change defines and implements the opcodes and support methods, but does not cause them to be used. Also, changed dvmQuasiAtomicRead64's argument to be const. Change-Id: I9e44fe881e87f27aa41f6c6e898ec4402cb5493e
6e10b9aaa72425a4825a25f0043533d0c6fdbba4	15-Jun-2010	Andy McFadden <fadden@android.com>	Atomic op cleanup. Replaced VM-local macros for barrier and CAS calls with the actual versions provided by cutils. ATOMIC_CMP_SWAP(addr,old,new) --> android_atomic_release_cas(old,new,addr) MEM_BARRIER --> ANDROID_MEMBAR_FULL Renamed android_quasiatomic* to dvmQuasiAtomic*. Didn't change how anything works, just the names. Change-Id: I8c68f28e1f7c9cb832183e0918d097dfe6a2cac8
891416ef725a5d9e64e5a1422f65394068c6e106	11-Jun-2010	Andy McFadden <fadden@android.com>	Update armv7-a-neon. An externally-contributed fix was not able to update the armv7-a-neon output files, because they don't exist in the open-source repository. All I did was re-run rebuild.sh. Change-Id: I3bf436ee8c0f57deb033815fd07f1a531ce851a1
7365493ad8d360c1dcf9cd8b6eee62747af01cae	09-Jun-2010	Carl Shapiro <cshapiro@google.com>	Remove repeated newlines at the end of files. Change-Id: I1e3d103a7b932ef21acedb6438c0f26b315df28f
de75089fb7216d19e9c22cce4dc62a49513477d3	09-Jun-2010	Carl Shapiro <cshapiro@google.com>	Remove trailing whitespace. Change-Id: I95534bb2b88eaf48f2329282041118cd034c812b
8ba2708ea118381f2df5ca55b9bad2ae4c050504	21-May-2010	Andy McFadden <fadden@android.com>	Added EXPORT_PC to "throw" instruction. For bug 2700761. Change-Id: I889e59ea35d9cadd99fc884e5b1301a4cf103f93
fbdcfb9ea9e2a78f295834424c3f24986ea45dac	29-May-2010	Brian Carlstrom <bdc@google.com>	Merge remote branch 'goog/dalvik-dev' into dalvik-dev-to-master Change-Id: I0c0edb3ebf0d5e040d6bbbf60269fab0deb70ef9
bd0472480c6e876198fe19c4ffa22350c0ce57da	13-May-2010	Bill Buzbee <buzbee@google.com>	JIT: Fix for [Issue 2675245] FRF40 monkey crash in jit-cache The JIT's chaining mechanism suffered from a narrow window that could result in i-cache inconsistency. One of the forms of chaining cell consisted of a two 16-bit thumb instruction sequence. If a thread were interrupted between the execution of those two instructions and another thread picked that moment to convert that cell's chained/unchained state, then bad things happen. This CL alters the chain/unchain model somewhat to avoid this case. Chainable chaining cells grow by 4 bytes each, and instead of rewriting a 32-bit cell to chain/unchain, we switch between chained and unchained state by [re]writing the first 16-bits of the cell as either a 16-bit Thumb unconditional branch (unchained mode) or the first half of a 32-bit Thumb branch. The 2nd 16-bits of the cell will never change once the cell moves from its initial state - thus avoiding the possibility of it becoming inconsistent. This adds a trivial execution penalty on the slow path, but will add about a kByte of memory usage to a typical process. Change-Id: Id8b99802e11386cfbab23da6abae10e2d9fc4065
978738d2cbf9d08fa78c65762eaac3351ab76b9a	13-May-2010	Ben Cheng <bccheng@android.com>	Add counters to track JIT inline cache hit rate and code cache patch counts. Also did some WITH_JIT_TUNING cleanup. Change-Id: I8bb2d681a06b0f2af1f976a007326825a88cea38
c95e0fbce4f77b2b08eb48205e405793de0d4248	29-Apr-2010	Andy McFadden <fadden@android.com>	Rework common_periodicChecks. The function was rewritten to optimize the common path. The control flow now matches the C version, which tests for debugger/profiler even if the previous test for suspension came up true. This also adds a minor optimization on the test for debugger attachment, allowing us to skip a load from memory if the process is simply not debuggable. (The optimization isn't yet enabled because a similar change must be made to the x86 asm code.) The VM apparently hadn't been built without debugging/profiling support for a while, so this fixes those places (necessary to be able to test all forms of the new code). Bug 2634642. Change-Id: I096b58c961bb73ee0d128ba776d68dbf29bba924
7a44e4ee0782d24b4c6090be1f0a3c66f971f2c1	29-Apr-2010	Andy McFadden <fadden@android.com>	Use unsigned compare for stack overflow. When checking for stack overflow we're using a comparison that is treating the pointers as signed values. If we manage to get a stack straddling 0x80000000, this will not work correctly. Bug 2613607. Change-Id: I5d178db86e93a3bb1e6a417e88d7cb1770d285bb
d5adae17d71e86a1a5f3ae7825054e3249fb7879	27-Mar-2010	Ben Cheng <bccheng@android.com>	Improve JIT self verifier test coverage to follow single-step instructions. Bug: 2549326 Change-Id: I01412d4aac1379b61c90fe6e59c534b33be93f66
861b33855aff080278ea5125e4372a2d4bf8aef5	06-Mar-2010	Andy McFadden <fadden@android.com>	Make wide-volatile loads and stores atomic. This implements the four wide-volatile instructions added in a previous change, and modifies the verifier to substitute the opcodes into the instruction stream when appropriate. For mterp, the ARM wide get/put instructions now have conditional code that replaces ldrd/strd with a call to the quasiatomic functions. The C version does essentially the same thing. ARMv4T lacks ldrd/strd, and uses separate implementations for the wide field accesses, so those were updated as well. x86 will just use stubs. The JIT should punt these to the interpreter. Change-Id: Ife88559ed1a698c3267d43c454896f6b12081c0f Also: - We don't seem to be using the negative widths in the instruction table. Not sure they're useful anymore. - Tabs -> spaces in x86-atom throw-verification-error impl.
51ae442fa9ed49e081e58e5127d1805789dbb196	13-Mar-2010	Bill Buzbee <buzbee@google.com>	Jit: Minor cleanup - enum size fix, remove useless code, control consistency. Change-Id: Id8c16303efd25683ad4b04a85e0d2a059b5ec3be
fd7e221cce6d3c63fd26599d58e0a35db7f5d1fa	09-Mar-2010	Colin Cross <ccross@android.com>	Add armv7-a-neon build target Change-Id: I981d55b53f6b3c185fe93384924bdbe18057132c
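
Note on revision 7a44e4ee (29-Apr-2010, "Use unsigned compare for stack overflow"): the log entry describes a pointer comparison that misbehaves when a stack straddles 0x80000000. The following is a minimal C sketch of that class of bug, not the interpreter's actual code. The helper names, the 32-bit address model, and the example addresses are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helpers (not from the Dalvik sources). Addresses are
 * modelled as 32-bit values; stack_end is the lowest usable address of a
 * downward-growing stack, new_sp is the proposed new stack pointer.
 */

/* Buggy form: treats the addresses as signed values. The uint32_t to
 * int32_t conversion is implementation-defined for large values before
 * C23; on common two's-complement targets it wraps to a negative number. */
static bool overflows_signed(uint32_t new_sp, uint32_t stack_end) {
    return (int32_t)new_sp < (int32_t)stack_end;
}

/* Fixed form: addresses are compared as unsigned values. */
static bool overflows_unsigned(uint32_t new_sp, uint32_t stack_end) {
    return new_sp < stack_end;
}
```

With an assumed limit of 0x7FFFF000 and a stack pointer of 0x80000100 (above the limit, so no overflow), the signed form reports an overflow on typical two's-complement targets because 0x80000100 is read as a negative number, while the unsigned form does not.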