18fba346582c08d81aa96d9508c0e935bad5f36f |
|
20-Jan-2011 |
buzbee <buzbee@google.com> |
Support traceview-style profiling in all builds This change builds on an earlier bccheng change that allowed JIT'd code to avoid reverting to the debug portable interpreter when doing traceview-style method profiling. That CL introduced a new traceview build (libdvm_traceview) because the performance delta was too great to enable the capability for all builds. In this CL, we remove the libdvm_traceview build and provide full-speed method tracing in all builds. This is done by introducing "_PROF" versions of invoke and return templates used by the JIT. Normally, these templates are not used, and performance is unaffected. However, when method profiling is enabled, all existing translations are purged and new translations are created using the _PROF templates. These templates introduce a smallish performance penalty above and beyond the actual tracing cost, but again are only used when tracing has been enabled. Strictly speaking, there is a slight burden that is placed on invokes and returns in the non-tracing case - on the order of an additional 3 or 4 cycles per invoke/return. Those operations are already heavyweight enough that I was unable to measure the added cost in benchmarks. Change-Id: Ic09baf4249f1e716e136a65458f4e06cea35fc18
|
2e152baec01433de9c63633ebc6f4adf1cea3a87 |
|
16-Dec-2010 |
buzbee <buzbee@google.com> |
[JIT] Trace profiling support In preparation for method compilation, this CL causes all traces to include two entry points: profiling and non-profiling. For now, the profiling entry will only be used if dalvik is run with -Xjitprofile, and largely works like it did before. The difference is that profiling support no longer requires the "assert" build - it's always there now. This will enable us to do a form of sampling profiling of traces in order to identify hot methods or hot trace groups, while keeping the overhead low by only switching profiling on periodically. To turn the periodic profiling on and off, we simply unchain all existing translations and set the appropriate global profile state. The underlying translation lookup and chaining utilities will examine the profile state to determine which entry point to use (i.e., profiling or non-profiling) while the traces naturally rechain during further execution. Change-Id: I9ee33e69e33869b9fab3a57e88f9bc524175172b
|
88dc28740f628a8d0d8fe05af0e11443f8793aa1 |
|
03-Feb-2010 |
jeffhao <jeffhao@google.com> |
Made Self Verification mode's memory interface less intrusive.
|
9e45c0b968d63ea38353c99252d233879c2efdaf |
|
03-Feb-2010 |
jeffhao <jeffhao@google.com> |
Made Self Verification mode's memory interface less intrusive.
|
f5ceaebfe5633a16b11a7073d2bf36b5bb0c9945 |
|
02-Feb-2010 |
Bill Buzbee <buzbee@google.com> |
Jit: Rework monitor enter/exit to simplify thread suspension The Jit must stop all threads in order to flush the translation cache (and other tables). Threads which are blocked in a monitor wait cause some headache here because they effectively hold a reference to the translation cache (through the return address on the native stack). The new model introduced in this CL is that for the fast path of monitor enter, control is allowed to resume in the translation cache. However, if we need to do a heavyweight lock (which may cause us to block), control does not return to the translation cache but instead bails out to the interpreter. This allows us to safely clear the code cache even if some threads are in THREAD_MONITOR state.
|
c1d9ed490a7bd6caab51df41f3c9e590fcecb727 |
|
02-Feb-2010 |
Bill Buzbee <buzbee@google.com> |
Jit: Rework monitor enter/exit to simplify thread suspension The Jit must stop all threads in order to flush the translation cache (and other tables). Threads which are blocked in a monitor wait cause some headache here because they effectively hold a reference to the translation cache (through the return address on the native stack). The new model introduced in this CL is that for the fast path of monitor enter, control is allowed to resume in the translation cache. However, if we need to do a heavyweight lock (which may cause us to block), control does not return to the translation cache but instead bails out to the interpreter. This allows us to safely clear the code cache even if some threads are in THREAD_MONITOR state.
|
72e93344b4d1ffc71e9c832ec23de0657e5b04a5 |
|
13-Nov-2009 |
Jean-Baptiste Queru <jbq@google.com> |
eclair snapshot
|
9a8c75adb2abf551d06dbf757bff558c1feded08 |
|
08-Nov-2009 |
Bill Buzbee <buzbee@google.com> |
Introduce "just interpret" chainable pseudo-translation. This is the first step towards enabling translation & self-cosim stress modes. When trace selection begins, the trace head address is pinned and remains in a limbo state until the translation is complete. Previously, if the trace selection aborted for any reason, the trace head would remain forever in limbo. This was not a correctness problem, but caused some small performance anomalies and made life more difficult for self-cosimulation mode. This CL introduces a pseudo-translation that simply routes control to the interpreter. When we detect that a trace selection attempt has failed, the trace head is associated with this fully-chainable pseudo-translation. This also has the benefit for self-cosimulation that we are guaranteed forward progress.
|
fd023aaec5f2b0df61d1702ea2f29a70abe90158 |
|
02-Nov-2009 |
Bill Buzbee <buzbee@google.com> |
Jit - optimized inline string compareto, indexof; fill_array_data bug fix Added flushAllRegs() prior to C handlers in preparation for upcoming support for holding live/dirty values in physical registers.
|
1465db5ee2d3c4c4dcc8e017a294172e858765cb |
|
24-Sep-2009 |
Bill Buzbee <buzbee@google.com> |
Major register allocation rework - stage 1. Direct usage of registers abstracted out. Live values tracked locally. Redundant loads and stores suppressed. Address of registers and register pairs unified w/ single "location" mechanism. Register types inferred using existing dataflow analysis pass. Interim (i.e. Hack) mechanism for storing register liveness info. Rewrite TBD. Stubbed-out code for linear scan allocation (for loop and long traces). Moved optimistic lock check for monitor-enter/exit inline for Thumb2. Minor restructuring, renaming and general cleanup of codegen. Renaming of enums to follow coding convention. Formatting fixes introduced by the enum renaming. Rewrite of RallocUtil.c and addition of linear scan to come in stage 2.
|
4f48917c0741e4d9b15ca7c45956aea05fea103f |
|
28-Sep-2009 |
Ben Cheng <bccheng@google.com> |
Fixed OOM exception handling in JIT'ed code and added a new unit test.
|
50a6bf2f01efba0acbff9bb03e7ee09688553e08 |
|
08-Jul-2009 |
Bill Buzbee <buzbee@google.com> |
Inline-execute for Java.Lang.Math routines, jit codegen restructure, various bug fixes.
|