f3f9cf6b65c4bcf9ea44253188d8d910b7cf7e64 |
|
14-Apr-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Add weight to compiled/interpreter transitions.

Also:
- Cleanup logging.
- Check ArtMethod status before adding compilation requests.
- Don't request osr compilation if we know AddSamples does not come from a back edge.

Bug: 27865109
(cherry picked from commit 71cd50fb67fa48667b0ab59aa436a582c04ba43d)
Change-Id: Icbe89fe6cc495b113616391a8f257758d34b4b60
|
8a06497868d5b5cb990a04fbd8ab20b3edec139c |
|
04-Apr-2016 |
Bill Buzbee <buzbee@google.com> |
Revert "Revert "ART: Improve JitProfile perf in x86_64 mterp"" Bug: 28080135 Bug triggering original revert fixed by: https://android-review.googlesource.com/#/c/214728 This CL additionally corrects a secondary bug in argument setup appearing in both x86 and x86_64 versions. This reverts commit 0402c5690b1a961e923a39dab92ec1ee0b54b05a. (cherry picked from commit 9afaac4ccdd90774cf95ce6fc42d9c6df4c8b817) Change-Id: If86a5d43469d8a958e007acc0afe924330de5c16
|
bb11c8b1219f5b4b3154c2c83fca19ec8add6646 |
|
12-Apr-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Remove the JIT from the instrumentation framework. This was slowing down the interpreter for no reason. Also, call AddSamples for invoke-static and invoke-direct. bug:27865109 (cherry picked from commit 274fe4adcb0610a9920be7814d9beb9cac6417ce) Change-Id: I3519456ac8e0c7211cbe3f12e88d134beee87479
|
b2771b41a956b50266d4d83fbb067f99faf7b7dc |
|
07-Apr-2016 |
Calin Juravle <calin@google.com> |
Add option to tune sample collection based on thread sensitivity Bug: 28065407 Bug: 27865109 Change-Id: Icdb89f8f8874a41c07e73185523d18e8956620d3
|
0e6aa6d945048345dee93f87070df6a62b31f680 |
|
11-Apr-2016 |
buzbee <buzbee@google.com> |
ART: Make mterp jit profiling race tolerant The JIT profiling mechanism is intentionally non-precise to minimize performance overhead. In general, this is not a problem. However, the on-stack replacement mechanism assumes an order of method compilation that can sometimes be violated if conditions are just right. This change allows compilation requests that were dropped due to a race condition to eventually be re-issued. It does this by allowing the 16-bit hotness counter to wrap around. Change-Id: I2ac8056af8c4f7f8cef3f2c3db70b0394c26a566
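
The wraparound idea in this message can be illustrated with a minimal sketch. This is not ART's code: `HotnessTracker`, `AddSample`, and the threshold value are hypothetical names chosen for illustration, and the sketch assumes a request is issued only when the counter equals the threshold exactly. Under that assumption, a request that was dropped is issued again once the 16-bit counter has wrapped and reached the threshold a second time.

```cpp
#include <cstdint>

// Hypothetical illustration only; these names are not ART's actual API.
struct HotnessTracker {
  uint16_t counter = 0;     // 16-bit hotness counter, allowed to wrap
  uint16_t threshold;       // assumed: request compilation on exact match
  int requests_issued = 0;  // how many compilation requests were made

  explicit HotnessTracker(uint16_t t) : threshold(t) {}

  // Records one sample. Returns true when a compilation request is issued.
  bool AddSample() {
    ++counter;  // stored back into uint16_t, so 65535 + 1 wraps to 0
    if (counter == threshold) {
      ++requests_issued;
      return true;
    }
    return false;
  }
};
```

With a saturating counter, the exact-match test would fire at most once per method. Letting the counter wrap means the same test fires again every 65536 samples, so a dropped request is eventually retried without any extra bookkeeping on the fast path.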
|
1d011d9306fd4ff57d72411775d415a86f5ed398 |
|
04-Apr-2016 |
Bill Buzbee <buzbee@google.com> |
Revert "Revert "Revert "Revert "ART: Improve JitProfile perf in arm/arm64 mterp"""" Bug: 28081559 This reverts commit 961ea9fe42edcc2c57469bf451d1ca421da5cd59. Change-Id: I98a5bb8112646706ae7bd73bf6393cb956466be3
|
d63420863dd2ac68881c03f953275e9569815c8e |
|
04-Apr-2016 |
Bill Buzbee <buzbee@google.com> |
Revert "Revert "ART: Improve JitProfile perf in x86 mterp"" This reverts commit 6b7d2c09b4710503a72ff5de31bff5cb23a3a921. Change-Id: I7c7a9a7e8a1dd03d2487fb59b1ea8fcb0c36aeb2
|
961ea9fe42edcc2c57469bf451d1ca421da5cd59 |
|
01-Apr-2016 |
Hiroshi Yamauchi <yamauchi@google.com> |
Revert "Revert "Revert "ART: Improve JitProfile perf in arm/arm64 mterp""" This reverts commit 4a8ac9cee4312ac910fabf31c64d28d4c8362836. 570-checker-osr intermittently failing. Bug: 27939339
|
6b7d2c09b4710503a72ff5de31bff5cb23a3a921 |
|
01-Apr-2016 |
Hiroshi Yamauchi <yamauchi@google.com> |
Revert "ART: Improve JitProfile perf in x86 mterp" This reverts commit 3e9edd1c63c1760f1bcffdbeaf721ebe3320f386. 570-checker-osr intermittently failing. Bug: 27939339
|
0402c5690b1a961e923a39dab92ec1ee0b54b05a |
|
01-Apr-2016 |
Hiroshi Yamauchi <yamauchi@google.com> |
Revert "ART: Improve JitProfile perf in x86_64 mterp" This reverts commit 099a611a418df6f0695e3bcd32fe896043ca1398. 570-checker-osr intermittently failing. Bug: 27939339 Change-Id: I9f1b4139118b1d803ea9c21319c3147d2f40fec9
|
d6190dcdfb1f247ff0bf8268dacf413c93b58cf5 |
|
31-Mar-2016 |
Calin Juravle <calin@google.com> |
Revert "Revert "Revert "ART: Improve JitProfile perf in arm/arm64 mterp""" This reverts commit 4a8ac9cee4312ac910fabf31c64d28d4c8362836. Bug: 27939339
|
099a611a418df6f0695e3bcd32fe896043ca1398 |
|
29-Mar-2016 |
Serguei Katkov <serguei.i.katkov@intel.com> |
ART: Improve JitProfile perf in x86_64 mterp Change-Id: Ieae39e2cc8de8d381e6f9de0faa440c90e20a7a5 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
|
3e9edd1c63c1760f1bcffdbeaf721ebe3320f386 |
|
28-Mar-2016 |
Serguei Katkov <serguei.i.katkov@intel.com> |
ART: Improve JitProfile perf in x86 mterp Change-Id: Id4c1e52352da8f6b7ce2008bc4adf52bc08847b2 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
|
4a8ac9cee4312ac910fabf31c64d28d4c8362836 |
|
25-Mar-2016 |
Bill Buzbee <buzbee@google.com> |
Revert "Revert "ART: Improve JitProfile perf in arm/arm64 mterp"" Ready for review. This reverts commit 6aef867f4d1a98a12bcdd65e9bf2ff894f0f2d7e. Change-Id: I5d53ed2bedc7e429ce7d3cdf80b6696a9628740e
|
6aef867f4d1a98a12bcdd65e9bf2ff894f0f2d7e |
|
25-Mar-2016 |
Calin Juravle <calin@google.com> |
Revert "ART: Improve JitProfile perf in arm/arm64 mterp" This reverts commit c1d6b341eed646e5adafc6c4fd4e3748f0292368.
|
c1d6b341eed646e5adafc6c4fd4e3748f0292368 |
|
02-Mar-2016 |
buzbee <buzbee@google.com> |
ART: Improve JitProfile perf in arm/arm64 mterp

ART currently requires two profiling-related things from the interpreters: hotness updates and OSR switch checks. The hotness updates previously used the existing instrumentation framework, which is flexible but quite heavyweight. For most things, the instrumentation framework overhead is acceptable, but because we do a hotness update on every backwards branch, the overhead is unacceptable here. Prior to this CL, branch profiling dominated interpreter cost.

Here, we bypass the instrumentation framework for hotness updates and deliver a significant performance improvement. Running interpreter-only (dalvikvm -Xint) on a Nexus 6, we see the logic subtest of Caffeinemark improving from 2600 to 9200, and the overall score going from 1979 to over 3000.

Compared to the C++ switch interpreter, we see a 6x improvement on the branchy logic subtest and a 2.6x improvement overall. Compared with the previous mterp, which did not have support for jit profiling, we see a small (1% to 5%) performance loss on the standard command-line benchmarks. I consider this acceptable (we could create an alternate non-profiling mterp which would have no penalty, but I don't consider this overhead big enough to justify that).

Change-Id: I50b5b8c5ed8ebda3c8b4e65d27ba7393c3feae04
|
db045bea24d28ce6ad932fec4ce055af7be530e2 |
|
04-Mar-2016 |
Alexey Frunze <Alexey.Frunze@imgtec.com> |
ART: Enable JitProfiling for MIPS64 Mterp Change-Id: I46bdbfd706569ebbb1d1b08b9060ff01518d0f3a
|
c8705a7801338b85cf9a8f8908b9e92a3283b114 |
|
26-Feb-2016 |
Serguei Katkov <serguei.i.katkov@intel.com> |
ART: Enable JitProfiling for x86_64 Mterp Adds branch profiling and enables for x86_64. Support interpreter switching in x86_64 mterp. Change-Id: I0cb9fcf3e2a01e411d84efc78449e86c10e6bcac Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
|
200f040af3e4fe9e178cb63c90860d58d90ef665 |
|
26-Feb-2016 |
Douglas Leung <douglas.leung@imgtec.com> |
[MIPS] Add Fast Art interpreter for Mips32. Change-Id: I6b9714dc8c01b8c9080bcba175faec1d2de08f8f
|
2de973de1094b6598d4d2b3457a3490d40c5d4fb |
|
23-Feb-2016 |
buzbee <buzbee@google.com> |
ART: Enable JitProfiling for x86 Mterp Adds branch profiling and enables for x86. Change-Id: I875034d5bc6b639df08a0236e415195521994238
|
9fb0ac70e4627be7113533cc126483117bfca068 |
|
19-Feb-2016 |
Serguei Katkov <serguei.i.katkov@intel.com> |
Enable bytecode tracing in ART FI Trace bytecode execution in Fast Interpreter similar to other interpreters. Update TraceExecutionEnabled function to switch on tracing. Change-Id: Icabc17871c8198b11cd4c3dbfaa901e4fbf67946 Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
|
c3b4c6e933160198d70668cace87d614112a63da |
|
19-Feb-2016 |
buzbee <buzbee@google.com> |
ART: Enable JitProfiling for Arm Mterp Also, fix missing shadow frame clear operation for the 64-bit shift operations. Change-Id: Icea95b3aeb1d6d36ea92336fb738cf56edd92da4
|
fd522f9039befff986701ff05054ffdd1be1dd33 |
|
11-Feb-2016 |
Bill Buzbee <buzbee@google.com> |
Revert "Revert "Revert "Revert "ART: Enable Jit Profiling in Mterp for arm/arm64"""" This reverts commit 5d03317a834efdf3b5240c401f1bc2ceac7a2f25. We need to catch all possible cases in which new instrumentation appears or the debugger is attached, and then switch to the reference interpreter if necessary. We may, in a future CL, use the alt-mterp mechanism to accomplish this (as did Dalvik). Only enables Arm64 for now. Once it survives extended testing, will enable arm and update x86. Updated OSR handling to match other interpreters. Change-Id: I076f1d752d6f59899876bab26b18e2221cd92f69
|
5d03317a834efdf3b5240c401f1bc2ceac7a2f25 |
|
11-Feb-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Revert "ART: Enable Jit Profiling in Mterp for arm/arm64"""

Unfortunately, run-test interpreter on arm32 are still timing out, and the following jdwp tests on armv8 are failing:
- org.apache.harmony.jpda.tests.jdwp.Events.EventWithExceptionTest#testBreakpoint_BeforeException (no test history available)
- org.apache.harmony.jpda.tests.jdwp.Events.EventWithExceptionTest#testFieldAccess (no test history available)
- org.apache.harmony.jpda.tests.jdwp.Events.EventWithExceptionTest#testFieldModification (no test history available)
- org.apache.harmony.jpda.tests.jdwp.Events.EventWithExceptionTest#testMethodExit (no test history available)
- org.apache.harmony.jpda.tests.jdwp.Events.EventWithExceptionTest#testMethodExitWithReturnValue (no test history available)
- org.apache.harmony.jpda.tests.jdwp.Events.FieldAccessTest#testFieldAccessEvent (no test history available)
- org.apache.harmony.jpda.tests.jdwp.Events.FieldModification002Test#testFieldModifyEvent (no test history available)
- org.apache.harmony.jpda.tests.jdwp.Events.FieldModificationTest#testFieldModifyEvent (no test history available)
- org.apache.harmony.jpda.tests.jdwp.Events.MethodExitWithReturnValueTest#testMethodExitWithReturnValueException (no test history available)

This reverts commit 9687f244bdb5dd0b4d9dd804a7c8c7b4a911d364.
Change-Id: Iadac4902ab8d7eb574cc4abeba5f93388d59dcb4
|
9687f244bdb5dd0b4d9dd804a7c8c7b4a911d364 |
|
05-Feb-2016 |
Bill Buzbee <buzbee@google.com> |
Revert "Revert "ART: Enable Jit Profiling in Mterp for arm/arm64""

Fixes:
- missing sign extension in iget template
- Call to wrong branch profiling helper in arm/goto_16 and arm/goto_32
- Missing export PCs

Reworks:
- Branch handlers to reduce cost of branch profiling.

Re-enables Jit profiling for both Arm and Arm64.

Performance note: Branch profiling is relatively expensive, though the real cost will depend on branch frequency. Taking a very branch intensive benchmark, CaffeineMark's logic test, we see the following scores (higher is better):

Mterp (profiling off)   6187
Mterp (profiling on)    4305
Switch (profiling off)  3931
Switch (profiling on)   2032

This reverts commit 95717f0010e7a9445450f4d39babfaf3a83e29b5.
Change-Id: Ia2ef8b54ce95bfa86178b89c43f8a703316b2944
|
95717f0010e7a9445450f4d39babfaf3a83e29b5 |
|
05-Feb-2016 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "ART: Enable Jit Profiling in Mterp for arm/arm64"

Not super happy to revert this, but unfortunately, too many problems when testing:

arm: tests timeout when running run-tests with the interpreter.

arm64 failures:
- test-art-target-run-test-ndebug-prebuild-jit-relocate-ntrace-cms-checkjni-image-npictest-ndebuggable-003-omnibus-opcodes64
- test-art-target-run-test-ndebug-prebuild-jit-relocate-ntrace-cms-checkjni-image-npictest-ndebuggable-005-annotations64
- test-art-target-run-test-ndebug-prebuild-jit-relocate-ntrace-cms-checkjni-image-npictest-ndebuggable-064-field-access64
- test-art-target-run-test-ndebug-prebuild-jit-relocate-ntrace-cms-checkjni-image-npictest-ndebuggable-406-fields64
- org.apache.harmony.jpda.tests.jdwp.Events.EventWithExceptionTest#testBreakpoint_BeforeException
- org.apache.harmony.jpda.tests.jdwp.Events.EventWithExceptionTest#testFieldAccess
- org.apache.harmony.jpda.tests.jdwp.Events.EventWithExceptionTest#testFieldModification
- org.apache.harmony.jpda.tests.jdwp.Events.EventWithExceptionTest#testMethodExit
- org.apache.harmony.jpda.tests.jdwp.Events.EventWithExceptionTest#testMethodExitWithReturnValue
- org.apache.harmony.jpda.tests.jdwp.Events.FieldAccessTest#testFieldAccessEvent
- org.apache.harmony.jpda.tests.jdwp.Events.FieldModification002Test#testFieldModifyEvent
- org.apache.harmony.jpda.tests.jdwp.Events.FieldModificationTest#testFieldModifyEvent
- org.apache.harmony.jpda.tests.jdwp.Events.MethodExitWithReturnValueTest#testMethodExitWithReturnValueException

This reverts commit a0a16105423459287497a98129dcba2828ccd7f0.
Change-Id: I8ff0512265ed0a422be67e7410998ad02639509c
|
a0a16105423459287497a98129dcba2828ccd7f0 |
|
04-Feb-2016 |
buzbee <buzbee@google.com> |
ART: Enable Jit Profiling in Mterp for arm/arm64 Adds the hooks for branch profiling to arm and arm64. The other Jit profiling modes are handled in common code. Stubbed out support for on-stack replacement. Change-Id: Ic298a81139108c3d7f1325b59d97e14a9de08de6
|
a2c97a94ee9379b23204bfef87afacd4b60cae37 |
|
26-Jan-2016 |
buzbee <buzbee@google.com> |
[WIP] ART Mterp: fix for hidden gc roots To support moving gc, we must not hold onto any references solely in registers across potential gc points. This was happening during returns, instance-of and check-cast. [testing in progress] Change-Id: I367750658c5716960737f0666e46800240fd392d
|
76833da70bb9e493201a675d2718dca0f2cc256c |
|
13-Jan-2016 |
buzbee <buzbee@google.com> |
ART: Mterp read barrier fix + minor cleanup Read barrier support relies on hooks in common code for loading object references. Mterp missed doing this for iget-object-quick. Also, added missing conditional assembly around debug event logging for mterp fallback and deleted an unnecessary store. Bug: 26510411 Change-Id: I2d5b27c4090be58d3cfcb14309d14ccabf04a6f5
|
1452bee8f06b9f76a333ddf4760e4beaa82f8099 |
|
06-Mar-2015 |
buzbee <buzbee@google.com> |
Fast Art interpreter

Add a Dalvik-style fast interpreter to Art. Three primary deficiencies in the existing Art interpreter will be addressed:

1. Structural inefficiencies (primarily the bloated fetch/decode/execute overhead of the C++ interpreter implementation).
2. Stack memory wastage. Each managed-language invoke adds a full copy of the interpreter's compiler-generated locals on the shared stack. We're at the mercy of the compiler now in how much memory is wasted here. An assembly based interpreter can manage memory usage more effectively.
3. Shadow frame model, which not only spends twice the memory to store the Dalvik virtual registers, but causes vreg stores to happen twice.

This CL mostly deals with #1 (but does provide some stack memory savings). Subsequent CLs will address the other issues.

Current status:
- Passes all run-tests.
- Phone boots interpret-only.
- 2.5x faster than Clang-compiled Art goto interpreter on fetch/decode/execute microbenchmark, 5x faster than gcc-compiled goto interpreter.
- 1.6x faster than Clang goto on Caffeinemark overall
- 2.0x faster than Clang switch on Caffeinemark overall
- 68% of Dalvik interpreter performance on Caffeinemark (still much slower, primarily because of poor invoke performance and lack of execute-inline)
- Still nearly an order of magnitude slower than Dalvik on invokes (but slightly better than Art Clang goto interpreter).
- Importantly, saves ~200 bytes of stack memory per invoke (but still wastes ~400 relative to Dalvik).

What's needed:
- Remove the (large quantity of) bring-up hackery in place.
- Integrate into the build mechanism. I'm still using the old Dalvik manual build step to generate assembly code from the stub files.
- Remove the suspend check hack. For bring-up purposes, I'm using an explicit suspend check (like the other Art interpreters). However, we should be doing a Dalvik style suspend check via the table base switch mechanism. This should be done during the alternative interpreter activation.
- General cleanup.
- Add CFI info.
- Update the new target bring-up README documentation.
- Add other targets.

In later CLs:
- Consolidate mterp handlers for expensive operations (such as new-instance) with the code used by the switch interpreter. No need to duplicate the code for heavyweight operations (but will need some refactoring to align).
- Tuning - some fast paths need to be moved down to the assembly handlers, rather than being dealt with in the out-of-line code.
- JIT profiling. Currently, the fast interpreter is used only in the fast case - no instrumentation, no transactions and no access checks. We will want to implement fast + JIT-profiling as the alternate fast interpreter. All other cases can still fall back to the reference interpreter.
- Improve invoke performance. We're nearly an order of magnitude slower than Dalvik here. Some of that is unavoidable, but I suspect we can do better.
- Add support for our other targets.

Change-Id: I43e25dc3d786fb87245705ac74a87274ad34fedc
|