History log of /art/compiler/optimizing/intrinsics_list.h
Revision	Date	Author	Comments
0e54c0160c84894696c05af6cad9eae3690f9496	04-Mar-2016	Aart Bik <ajcbik@google.com>	Unsafe: Recognize intrinsics for 1.8 java.util.concurrent With unit test. Rationale: Recognizing the 1.8 methods as intrinsics is the first step towards providing efficient implementation on all architectures. Where not implemented (everywhere for now), the methods fall back to the JNI native or reference implementation. NOTE: needs iam's CL first! bug=26264765 Change-Id: Ife65e81689821a16cbcdd2bb2d35641c6de6aeb6
2f9fcc999fab4ba6cd86c30e664325b47b9618e5	02-Mar-2016	Aart Bik <ajcbik@google.com>	Simplified intrinsic macro mechanism. Rationale: Reduces boiler-plate code in all intrinsics code generators. Also, the newly introduced "unreachable" macro provides a static verifier that we do not have unreachable and thus redundant code in the generators. In fact, this change exposes that the MIPS32 and MIPS64 rotation intrinsics (IntegerRotateRight, LongRotateRight, IntegerRotateLeft, LongRotateLeft) are unreachable, since they are handled as HIR constructs for all architectures. Thus the code can be removed. Change-Id: I0309799a0db580232137ded72bb8a7bbd45440a8
2a6aad9d388bd29bff04aeec3eb9429d436d1873	25-Feb-2016	Aart Bik <ajcbik@google.com>	Implement fp to bits methods as intrinsics. Rationale: Better optimization, better performance. Results on libcore benchmark: Most gain is from moving the invariant call out of the loop after we detect everything is a side-effect free intrinsic. But generated code in general case is much cleaner too. Before: timeFloatToIntBits() in 181 ms. timeFloatToRawIntBits() in 35 ms. timeDoubleToLongBits() in 208 ms. timeDoubleToRawLongBits() in 35 ms. After: timeFloatToIntBits() in 36 ms. timeFloatToRawIntBits() in 35 ms. timeDoubleToLongBits() in 35 ms. timeDoubleToRawLongBits() in 34 ms. bug=11548336 Change-Id: I6e001bd3708e800bd75a82b8950fb3a0fc01766e
59c9454b92c2096a30a2bbdffb64edf33dbdd916	25-Jan-2016	Aart Bik <ajcbik@google.com>	Recognize common utilities as intrinsics. Rationale: Recognizing these method calls as intrinsics already has major advantages (compiler knows about no-side-effects/no-throw properties). Next step is, of course, to implement these with native instructions on each architecture. Change-Id: I06fd12973238caec00d67b31b195d7f8807a538e
3f67e692860d281858485d48a4f1f81b907f1444	15-Jan-2016	Aart Bik <ajcbik@google.com>	Implemented BitCount as an intrinsic. With unit test. Rationale: Recognizing this important operation as an intrinsic has various advantages: (1) having the no-side-effects/no-throw allows for much more GVN/LICM/BCE. (2) Some architectures, like x86_64, provide direct support for this operation. Performance improvements on X86_64: CheckersEvalBench (32-bit bitboard): 27,210KNS -> 36,798KNS = + 35% ReversiEvalBench (64-bit bitboard): 52,562KNS -> 89,086KNS = + 69% Change-Id: I65d549b0469b7909b12c6611cdc34a8640a5751f
5d75afe333f57546786686d9bee16b52f1bbe971	14-Dec-2015	Aart Bik <ajcbik@google.com>	Improved side-effects/can-throw information on intrinsics. Rationale: improved side effect and exception analysis gives many more opportunities for GVN/LICM/BCE. Change-Id: I8aa9b757d77c7bd9d58271204a657c2c525195b5
a4f1220c1518074db18ca1044e9201492975750b	06-Aug-2015	Mark Mendell <mark.p.mendell@intel.com>	Optimizing: Add direct calls to math intrinsics Support the double forms of: cos, sin, acos, asin, atan, atan2, cbrt, cosh, exp, expm1, hypot, log, log10, nextAfter, sinh, tan, tanh Add these entries to the vector addressed off the thread pointer. Call the libc routines directly, which means that we have to implement the native ABI, not the ART one. For x86_64, that includes saving XMM12-15 as the native ABI considers them caller-save, while the ART ABI considers them callee-save. We save them by marking them as used by the call to the math function. For x86, this is not an issue, as all the XMM registers are caller-save. Other architectures will call Java as before until they are ready to implement the new intrinsics. Bump the OAT version since we are incompatible with old boot.oat files. Change-Id: Ic6332c3555c09393a17d1ad4daf62932488722fb Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
ee3cf0731d0ef0787bc2947c8e3ca432b513956b	06-Oct-2015	Nicolas Geoffray <ngeoffray@google.com>	Intrinsify System.arraycopy. Currently on x64, will do the other architectures in different changes. Change-Id: I15fbbadb450dd21787809759a8b14b21b1e42624
9ee23f4273efed8d6378f6ad8e63c65e30a17139	23-Jul-2015	Scott Wakeling <scott.wakeling@linaro.org>	ARM/ARM64: Intrinsics - numberOfTrailingZeros, rotateLeft, rotateRight Change-Id: I2a07c279756ee804fb7c129416bdc4a3962e93ed
05f2056b4f11e0b2bac92b2655abe7030771f5dc	19-Aug-2015	Agi Csaki <agicsaki@google.com>	Add support to indicate whether intrinsics require a dex cache A structural change to indicate whether a given intrinsic requires access to a dex cache. I updated the needs_environment_ field to indicate whether an HInvoke needs an environment or a dex cache, and if an HInvoke represents an intrinsified method, we utilize this field to determine if the HInvoke needs a dex cache. Bug: 21481923 Change-Id: I9dd25a385e1a1397603da6c4c43f6c1aea511b32
7da072feb160079734331e994ea52760cb2a3243	13-Aug-2015	agicsaki <agicsaki@google.com>	Structure for String.Equals intrinsic Added structure for implementing String.Equals intrinsics. There is no functional change at this point- the intrinsic is marked as unimplemented for all instruction sets and compilers. Bug: 21481923 Change-Id: Ic2a1e22a113ff6091581126f12e926478c011340
57b81ecbe74138992dd447251e94ed06cd5eb802	12-Aug-2015	agicsaki <agicsaki@google.com>	Add support to indicate whether intrinsics require an environment A structural change to indicate whether a given intrinsic requires access to an environment. I added a field to HInvoke objects to indicate if they need an environment whose default value is true and is only updated if an intrinsic is marked as not requiring an environment. At this point there is no functional change, as all intrinsics are marked as requiring an environment. This change adds the structure for future inliner work which will allow us to inline more intrinsified calls. Change-Id: I2930e3cef7b785384bf95b95a542d34af442f3b9
611d3395e9efc0ab8dbfa4a197fa022fbd8c7204	10-Jul-2015	Scott Wakeling <scott.wakeling@linaro.org>	ARM/ARM64: Implement numberOfLeadingZeros intrinsic. Change-Id: I4042fb7a0b75140475dcfca23e8f79d310f5333b
aabdf8ad2e8d3de953dff5c7591e7b3df4d4f60b	03-Aug-2015	Roland Levillain <rpl@google.com>	Revert "Optimizing String.Equals as an intrinsic (x86)" Reverted as it breaks the compilation of boot.{oat,art} on x86 (although this CL may not be the culprit, as the issue seems to come from Optimizing's register allocator). This reverts commit 8ab7bd6c8b10ad58758c33a1dc9326212bd200e9. Change-Id: If7c8b6258d1e690f4d2a06bcc82c92563ac6cdef
8ab7bd6c8b10ad58758c33a1dc9326212bd200e9	27-Jul-2015	agicsaki <agicsaki@google.com>	Optimizing String.Equals as an intrinsic (x86) The third implementation of String.Equals. I added an intrinsic in x86 which is similar to the original java implementation of String.equals: an instanceof check, null check, length check, and reference equality check followed by a loop comparing strings character by character. Interesting Benchmarking Values: Optimizing Compiler on Nexus Player Intrinsic 15-30 Character Strings: 177 ns Original 15-30 Character Strings: 275 ns Intrinsic Null Argument: 59 ns Original Null Argument: 137 ns Intrinsic 100-1000 Character Strings: 1812 ns Original 100-1000 Character Strings: 6334 ns Bug: 21481923 Change-Id: Ia386e19b9dbfe0dac688b20ec93d8f90f67af47e
848f70a3d73833fc1bf3032a9ff6812e429661d9	15-Jan-2014	Jeff Hao <jeffhao@google.com>	Replace String CharArray with internal uint16_t array. Summary of high level changes: - Adds compiler inliner support to identify string init methods - Adds compiler support (quick & optimizing) with new invoke code path that calls method off the thread pointer - Adds thread entrypoints for all string init methods - Adds map to verifier to log when receiver of string init has been copied to other registers. used by compiler and interpreter Change-Id: I797b992a8feb566f9ad73060011ab6f51eb7ce01
3e90a96f403cbc353731e6687fe12a088f996cee	27-Mar-2015	Razvan A Lupusoru <razvan.a.lupusoru@intel.com>	[optimizing] Do not inline intrinsics The intrinsics generally have specialized code and the code for them may be faster than what can be achieved with inlining. Thus inliner should skip intrinsics. At the same time, easy methods are not worth intrinsifying: i.e., String length and isEmpty. Those can be handled by inliner with no problem and can actually lead to better code since call is not kept around through all of the optimizations. Change-Id: Iab38e6c33f79efa54d845d4871cf26fa9b235ab0 Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
878d58cbaf6b17a9e3dcab790754527f3ebc69e5	16-Jan-2015	Andreas Gampe <agampe@google.com>	ART: Arm64 optimizing compiler intrinsics Implement most intrinsics for the optimizing compiler for Arm64. Change-Id: Idb459be09f0524cb9aeab7a5c7fccb1c6b65a707
71fb52fee246b7d511f520febbd73dc7a9bbca79	30-Dec-2014	Andreas Gampe <agampe@google.com>	ART: Optimizing compiler intrinsics Add intrinsics infrastructure to the optimizing compiler. Add almost all intrinsics supported by Quick to the x86-64 backend. Further intrinsics require more assembler support. Change-Id: I48de9b44c82886bb298d16e74e12a9506b8e8807