0e54c0160c84894696c05af6cad9eae3690f9496 |
|
04-Mar-2016 |
Aart Bik <ajcbik@google.com> |
Unsafe: Recognize intrinsics for 1.8 java.util.concurrent
With unit test.
Rationale: Recognizing the 1.8 methods as intrinsics is the first step towards providing efficient implementation on all architectures. Where not implemented (everywhere for now), the methods fall back to the JNI native or reference implementation.
NOTE: needs iam's CL first!
bug=26264765
Change-Id: Ife65e81689821a16cbcdd2bb2d35641c6de6aeb6
|
2a6aad9d388bd29bff04aeec3eb9429d436d1873 |
|
25-Feb-2016 |
Aart Bik <ajcbik@google.com> |
Implement fp to bits methods as intrinsics.
Rationale: Better optimization, better performance.
Results on libcore benchmark: Most gain is from moving the invariant call out of the loop after we detect everything is a side-effect free intrinsic. But generated code in general case is much cleaner too.
Before:
timeFloatToIntBits() in 181 ms.
timeFloatToRawIntBits() in 35 ms.
timeDoubleToLongBits() in 208 ms.
timeDoubleToRawLongBits() in 35 ms.
After:
timeFloatToIntBits() in 36 ms.
timeFloatToRawIntBits() in 35 ms.
timeDoubleToLongBits() in 35 ms.
timeDoubleToRawLongBits() in 34 ms.
bug=11548336
Change-Id: I6e001bd3708e800bd75a82b8950fb3a0fc01766e
|
38e9e8046ea2196284bdb4638771c31108a30a4a |
|
18-Feb-2016 |
Jean-Philippe Halimi <jean-philippe.halimi@intel.com> |
Add statistics support for some optimizations This patch adds support for the --dump-stats facility with some optimizations and fixes all build issues introduced by the patch: I68751b119a030952a11057cb651a3c63e87e73ea (which got reverted) Change-Id: I5af1f2a8cced0a1a55c2bb4d8c88e6f0a24ec879 Signed-off-by: Jean-Philippe Halimi <jean-philippe.halimi@intel.com>
|
f8b3b8bc37fb04d8ae113ae6bfcf4de2f5a700d4 |
|
04-Feb-2016 |
Vladimir Marko <vmarko@google.com> |
Try to substitute constructor chains for IPUTs. Match a constructor chain where each constructor either forwards some or all of its arguments to the next (i.e. superclass constructor or a constructor in the same class) and may pass extra zeros (of any type, including null), followed by any number of IPUTs on "this", storing either arguments or zeros, until we reach the constructor of java.lang.Object. When collecting IPUTs from the constructor chain, remove any IPUTs that store the same field as an IPUT that comes later. This is safe in this case even if those IPUTs store volatile fields because the uninitialized object reference wasn't allowed to escape yet. Also remove any IPUTs that store zero values as the allocated object is already zero-initialized. (cherry picked from commit 354efa6cdf558b2331e8fec539893fa51763806e) Change-Id: I691e3b82e550e7a3272ce6a81647c7fcd02c01b1
|
354efa6cdf558b2331e8fec539893fa51763806e |
|
04-Feb-2016 |
Vladimir Marko <vmarko@google.com> |
Try to substitute constructor chains for IPUTs. Match a constructor chain where each constructor either forwards some or all of its arguments to the next (i.e. superclass constructor or a constructor in the same class) and may pass extra zeros (of any type, including null), followed by any number of IPUTs on "this", storing either arguments or zeros, until we reach the constructor of java.lang.Object. When collecting IPUTs from the constructor chain, remove any IPUTs that store the same field as an IPUT that comes later. This is safe in this case even if those IPUTs store volatile fields because the uninitialized object reference wasn't allowed to escape yet. Also remove any IPUTs that store zero values as the allocated object is already zero-initialized. Change-Id: If93022310bf04fe38ee741665ac4a65d4c2bb25f
|
59c9454b92c2096a30a2bbdffb64edf33dbdd916 |
|
25-Jan-2016 |
Aart Bik <ajcbik@google.com> |
Recognize common utilities as intrinsics. Rationale: Recognizing these method calls as intrinsics already has major advantages (compiler knows about no-side-effects/no-throw properties). Next step is, of course, to implement these with native instructions on each architecture. Change-Id: I06fd12973238caec00d67b31b195d7f8807a538e
|
3f67e692860d281858485d48a4f1f81b907f1444 |
|
15-Jan-2016 |
Aart Bik <ajcbik@google.com> |
Implemented BitCount as an intrinsic. With unit test.
Rationale: Recognizing this important operation as an intrinsic has various advantages:
(1) having the no-side-effects/no-throw allows for much more GVN/LICM/BCE.
(2) Some architectures, like x86_64, provide direct support for this operation.
Performance improvements on X86_64:
CheckersEvalBench (32-bit bitboard): 27,210KNS -> 36,798KNS = + 35%
ReversiEvalBench (64-bit bitboard): 52,562KNS -> 89,086KNS = + 69%
Change-Id: I65d549b0469b7909b12c6611cdc34a8640a5751f
|
5d75afe333f57546786686d9bee16b52f1bbe971 |
|
14-Dec-2015 |
Aart Bik <ajcbik@google.com> |
Improved side-effects/can-throw information on intrinsics. Rationale: improved side effect and exception analysis gives many more opportunities for GVN/LICM/BCE. Change-Id: I8aa9b757d77c7bd9d58271204a657c2c525195b5
|
a4f1220c1518074db18ca1044e9201492975750b |
|
06-Aug-2015 |
Mark Mendell <mark.p.mendell@intel.com> |
Optimizing: Add direct calls to math intrinsics Support the double forms of: cos, sin, acos, asin, atan, atan2, cbrt, cosh, exp, expm1, hypot, log, log10, nextAfter, sinh, tan, tanh Add these entries to the vector addressed off the thread pointer. Call the libc routines directly, which means that we have to implement the native ABI, not the ART one. For x86_64, that includes saving XMM12-15 as the native ABI considers them caller-save, while the ART ABI considers them callee-save. We save them by marking them as used by the call to the math function. For x86, this is not an issue, as all the XMM registers are caller-save. Other architectures will call Java as before until they are ready to implement the new intrinsics. Bump the OAT version since we are incompatible with old boot.oat files. Change-Id: Ic6332c3555c09393a17d1ad4daf62932488722fb Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
e523423a053af5cb55837f07ceae9ff2fd581712 |
|
02-Dec-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Revert "Don't use the compiler driver for method resolution."" This reverts commit c88ef3a10c474045a3476a02ae75d07ddd3230b7. Change-Id: I0ed88a48b313a8d28bc39fae40631123aadb13ef
|
c88ef3a10c474045a3476a02ae75d07ddd3230b7 |
|
01-Dec-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Don't use the compiler driver for method resolution." Fails 425 in debuggable mode. This reverts commit 4db0bf9c4db6a09716c3388b7d2f88d534470339. Change-Id: I346df8f75674564fc4fb241c60f23e250fc7f0a7
|
4db0bf9c4db6a09716c3388b7d2f88d534470339 |
|
23-Nov-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Don't use the compiler driver for method resolution. The compiler driver makes assumptions that don't hold for the optimizing compiler, and will for example always go to slow path for an invoke-super when there's no verified method. Also fix GenerateInvokeVirtual in the presence of intrinsics. Next change will address some of the TODOs in sharpening.cc. Change-Id: I2b0e543ee9b9bebcadb2d26de29e850c59ad58b9
|
e34648dec914453f7e8b6c517dd272823319cd6d |
|
23-Nov-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Add stats support for existing optimizations" Breaks the build. Please ensure your changes build. This reverts commit 06241b1b07fb031b7d2cf55f4b78d3444d07cc2d. Change-Id: I68b18f99a9882719bf6654d3313531a7965b8483
|
06241b1b07fb031b7d2cf55f4b78d3444d07cc2d |
|
03-Sep-2015 |
Jean-Philippe Halimi <jean-philippe.halimi@intel.com> |
Add stats support for existing optimizations This patch adds support for the --dump-stats facility with existing optimizations. Change-Id: I68751b119a030952a11057cb651a3c63e87e73ea Signed-off-by: Jean-Philippe Halimi <jean-philippe.halimi@intel.com>
|
16ba2b4726cafc2d83cae4a65132aac15f372689 |
|
02-Nov-2015 |
Chris Larsen <chris.larsen@imgtec.com> |
MIPS32: java.lang.String.equals Add intrinsic support for String.equals on MIPS32. Change-Id: I2d184aa4d5dae7cdd4a89c2c902535692c9e7393
|
ee3cf0731d0ef0787bc2947c8e3ca432b513956b |
|
06-Oct-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Intrinsify System.arraycopy. Currently on x64, will do the other architectures in different changes. Change-Id: I15fbbadb450dd21787809759a8b14b21b1e42624
|
3039e381b79ac1ef01c420511f6629f639d40ab4 |
|
26-Aug-2015 |
Chris Larsen <chris.larsen@imgtec.com> |
MIPS64: Implement miscellaneous bit manipulation intrinsics
// java.lang.Double
- doubleToRawLongBits(double)
- longBitsToDouble(long)
// java.lang.Float
- floatToRawIntBits(float)
- intBitsToFloat(int)
// java.lang.Integer
- numberOfLeadingZeros(int)
- reverseBytes(int)
- reverse(int)
// java.lang.Long
- numberOfLeadingZeros(long)
- reverseBytes(long)
- reverse(long)
// java.lang.Short
- reverseBytes(short)
Change-Id: Ic8f8c4e7b584132e2282b4fd267453870fefbaaa
|
9ee23f4273efed8d6378f6ad8e63c65e30a17139 |
|
23-Jul-2015 |
Scott Wakeling <scott.wakeling@linaro.org> |
ARM/ARM64: Intrinsics - numberOfTrailingZeros, rotateLeft, rotateRight Change-Id: I2a07c279756ee804fb7c129416bdc4a3962e93ed
|
bfb5ba90cd6425ce49c2125a87e3b12222cc2601 |
|
01-Sep-2015 |
Andreas Gampe <agampe@google.com> |
Revert "Revert "Do a second check for testing intrinsic types."" This reverts commit a14b9fef395b94fa9a32147862c198fe7c22e3d7. When an intrinsic with invoke-type virtual is recognized, replace the instruction with a new HInvokeStaticOrDirect. Minimal update for dex-cache rework. Fix includes. Change-Id: I1c8e735a2fa7cda4419f76ca0717125ef236d332
|
a14b9fef395b94fa9a32147862c198fe7c22e3d7 |
|
25-Aug-2015 |
Andreas Gampe <agampe@google.com> |
Revert "Do a second check for testing intrinsic types." This reverts commit 4daa0b4c21eee46362b5114fb2c3800c0c7e7a36. If the intrinsic has a slow-path, like charAt, the slow-path logic will complain as it only understands direct slow-paths, not virtual calls. We should either override that decision in the slow-path, or replace the HInvokeVirtual when we're overriding the intrinsic choice. Bug: 23475673 Change-Id: If55fbc8c82d52e0e7a7aec2674ae2bd2b74b5c77
|
4daa0b4c21eee46362b5114fb2c3800c0c7e7a36 |
|
20-Aug-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Do a second check for testing intrinsic types. This allows intrinsifying calls made in a different dex file. Can't easily write a test because it depends on having inlined a method from boot classpath that calls an intrinsic. Once String.equals is implemented with the hybrid approach we can write one. Change-Id: I591d9496e236429943d6bfa7f8b20f576b1cfb9a
|
05f2056b4f11e0b2bac92b2655abe7030771f5dc |
|
19-Aug-2015 |
Agi Csaki <agicsaki@google.com> |
Add support to indicate whether intrinsics require a dex cache A structural change to indicate whether a given intrinsic requires access to a dex cache. I updated the needs_environment_ field to indicate whether an HInvoke needs an environment or a dex cache, and if an HInvoke represents an intrinsified method, we utilize this field to determine if the HInvoke needs a dex cache. Bug: 21481923 Change-Id: I9dd25a385e1a1397603da6c4c43f6c1aea511b32
|
7da072feb160079734331e994ea52760cb2a3243 |
|
13-Aug-2015 |
agicsaki <agicsaki@google.com> |
Structure for String.Equals intrinsic Added structure for implementing String.Equals intrinsics. There is no functional change at this point: the intrinsic is marked as unimplemented for all instruction sets and compilers. Bug: 21481923 Change-Id: Ic2a1e22a113ff6091581126f12e926478c011340
|
6cff09a873e0179f2a8d28727d4cd2447bd1bf16 |
|
13-Aug-2015 |
agicsaki <agicsaki@google.com> |
Intrinsics recognizer returns kNone for MIPS, MIPS64 instruction sets Since no intrinsics are implemented in MIPS or MIPS64, the intrinsics recognizer now does not mark methods as being intrinsified if the current instruction set is either MIPS or MIPS64. Change-Id: I9819ccd11d280e548623ad18add057eefefbf6d5
|
57b81ecbe74138992dd447251e94ed06cd5eb802 |
|
12-Aug-2015 |
agicsaki <agicsaki@google.com> |
Add support to indicate whether intrinsics require an environment A structural change to indicate whether a given intrinsic requires access to an environment. I added a field to HInvoke objects to indicate if they need an environment whose default value is true and is only updated if an intrinsic is marked as not requiring an environment. At this point there is no functional change, as all intrinsics are marked as requiring an environment. This change adds the structure for future inliner work which will allow us to inline more intrinsified calls. Change-Id: I2930e3cef7b785384bf95b95a542d34af442f3b9
|
611d3395e9efc0ab8dbfa4a197fa022fbd8c7204 |
|
10-Jul-2015 |
Scott Wakeling <scott.wakeling@linaro.org> |
ARM/ARM64: Implement numberOfLeadingZeros intrinsic. Change-Id: I4042fb7a0b75140475dcfca23e8f79d310f5333b
|
aabdf8ad2e8d3de953dff5c7591e7b3df4d4f60b |
|
03-Aug-2015 |
Roland Levillain <rpl@google.com> |
Revert "Optimizing String.Equals as an intrinsic (x86)" Reverted as it breaks the compilation of boot.{oat,art} on x86 (although this CL may not be the culprit, as the issue seems to come from Optimizing's register allocator). This reverts commit 8ab7bd6c8b10ad58758c33a1dc9326212bd200e9. Change-Id: If7c8b6258d1e690f4d2a06bcc82c92563ac6cdef
|
8ab7bd6c8b10ad58758c33a1dc9326212bd200e9 |
|
27-Jul-2015 |
agicsaki <agicsaki@google.com> |
Optimizing String.Equals as an intrinsic (x86)
The third implementation of String.Equals. I added an intrinsic in x86 which is similar to the original java implementation of String.equals: an instanceof check, null check, length check, and reference equality check followed by a loop comparing strings character by character.
Interesting Benchmarking Values:
Optimizing Compiler on Nexus Player
Intrinsic 15-30 Character Strings: 177 ns
Original 15-30 Character Strings: 275 ns
Intrinsic Null Argument: 59 ns
Original Null Argument: 137 ns
Intrinsic 100-1000 Character Strings: 1812 ns
Original 100-1000 Character Strings: 6334 ns
Bug: 21481923
Change-Id: Ia386e19b9dbfe0dac688b20ec93d8f90f67af47e
|
109c89a8e3b5023d123f8c1313f5843a0ba2e15e |
|
31-Jul-2015 |
David Brazdil <dbrazdil@google.com> |
ART: Change stream output kNone intrinsic Name of intrinsics is dumped with C1visualizer and checked with Checker whose attributes should not contain whitespace. This patch changes the output printed for non-intrinsified invokes. Change-Id: I3e565e8c9e26eb61026e7a13823eab20409dd63a
|
41b175aba41c9365a1c53b8a1afbd17129c87c14 |
|
19-May-2015 |
Vladimir Marko <vmarko@google.com> |
ART: Clean up arm64 kNumberOfXRegisters usage. Avoid undefined behavior for arm64 stemming from 1u << 32 in loops with upper bound kNumberOfXRegisters. Create iterators for enumerating bits in an integer either from high to low or from low to high and use them for <arch>Context::FillCalleeSaves() on all architectures. Refactor runtime/utils.{h,cc} by moving all bit-fiddling functions to runtime/base/bit_utils.{h,cc} (together with the new bit iterators) and all time-related functions to runtime/base/time_utils.{h,cc}. Improve test coverage and fix some corner cases for the bit-fiddling functions. Bug: 13925192 (cherry picked from commit 80afd02024d20e60b197d3adfbb43cc303cf29e0) Change-Id: I905257a21de90b5860ebe1e39563758f721eab82
|
80afd02024d20e60b197d3adfbb43cc303cf29e0 |
|
19-May-2015 |
Vladimir Marko <vmarko@google.com> |
ART: Clean up arm64 kNumberOfXRegisters usage. Avoid undefined behavior for arm64 stemming from 1u << 32 in loops with upper bound kNumberOfXRegisters. Create iterators for enumerating bits in an integer either from high to low or from low to high and use them for <arch>Context::FillCalleeSaves() on all architectures. Refactor runtime/utils.{h,cc} by moving all bit-fiddling functions to runtime/base/bit_utils.{h,cc} (together with the new bit iterators) and all time-related functions to runtime/base/time_utils.{h,cc}. Improve test coverage and fix some corner cases for the bit-fiddling functions. Bug: 13925192 Change-Id: I704884dab15b41ecf7a1c47d397ab1c3fc7ee0f7
|
d5111bf05fc0a9974280a80eeb43db6d5227a81e |
|
22-May-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Do not use dex_compilation_unit after inlining. It's incompatible with inlining, as inlined invokes/load class/new can be from another dex file. Change-Id: I8897b6a012942bc8e136f2bea70252d3fb3a7fa5
|
ec525fc30848189051b888da53ba051bc0878b78 |
|
28-Apr-2015 |
Roland Levillain <rpl@google.com> |
Factor MoveArguments methods in Optimizing's intrinsics handlers. Also add a precondition similar to the one present in code generators, regarding static invoke related explicit clinit check elimination in non-baseline compilations. Change-Id: I26f4dcb5d02824d7556f90b4b0c85b08b737fa53
|
848f70a3d73833fc1bf3032a9ff6812e429661d9 |
|
15-Jan-2014 |
Jeff Hao <jeffhao@google.com> |
Replace String CharArray with internal uint16_t array.
Summary of high level changes:
- Adds compiler inliner support to identify string init methods
- Adds compiler support (quick & optimizing) with new invoke code path that calls method off the thread pointer
- Adds thread entrypoints for all string init methods
- Adds map to verifier to log when receiver of string init has been copied to other registers. used by compiler and interpreter
Change-Id: I797b992a8feb566f9ad73060011ab6f51eb7ce01
|
65b798ea10dd716c1bb3dda029f9bf255435af72 |
|
06-Apr-2015 |
Andreas Gampe <agampe@google.com> |
ART: Enable more Clang warnings Change-Id: Ie6aba02f4223b1de02530e1515c63505f37e184c
|
3e90a96f403cbc353731e6687fe12a088f996cee |
|
27-Mar-2015 |
Razvan A Lupusoru <razvan.a.lupusoru@intel.com> |
[optimizing] Do not inline intrinsics The intrinsics generally have specialized code and the code for them may be faster than what can be achieved with inlining. Thus the inliner should skip intrinsics. At the same time, easy methods are not worth intrinsifying, e.g. String length and isEmpty. Those can be handled by the inliner with no problem and can actually lead to better code since the call is not kept around through all of the optimizations. Change-Id: Iab38e6c33f79efa54d845d4871cf26fa9b235ab0 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
|
878d58cbaf6b17a9e3dcab790754527f3ebc69e5 |
|
16-Jan-2015 |
Andreas Gampe <agampe@google.com> |
ART: Arm64 optimizing compiler intrinsics Implement most intrinsics for the optimizing compiler for Arm64. Change-Id: Idb459be09f0524cb9aeab7a5c7fccb1c6b65a707
|
71fb52fee246b7d511f520febbd73dc7a9bbca79 |
|
30-Dec-2014 |
Andreas Gampe <agampe@google.com> |
ART: Optimizing compiler intrinsics Add intrinsics infrastructure to the optimizing compiler. Add almost all intrinsics supported by Quick to the x86-64 backend. Further intrinsics require more assembler support. Change-Id: I48de9b44c82886bb298d16e74e12a9506b8e8807
|