History log of /libcore/libart/src/main/java/java/lang/StringFactory.java
Revision	Date	Author	Comments
2259a71ba68c8b2d0d74396277fd25cdc0692350	22-Dec-2017	Victor Chang <vichang@google.com>	Make the Android fast-path UTF-8 decoder follow the Unicode Standard and the W3C Encoding standard.

    The behavior of the UTF-8 decoder in the RI has strictly followed the Unicode Standard since OpenJDK 8 (JDK-7096080). Essentially, it
    1. rejects a 3-byte surrogate / 6-byte surrogate pair (CESU-8 sequence)
    2. treats an ill-formed sequence, e.g. a surrogate, as individual ill-formed bytes.

    This change updates Android's fast-path UTF-8 decoder to
    - follow the Unicode standard
    - have a behavior closer to RI OpenJDK 8
    - have consistent behavior between java.nio.charset.CharsetDecoder and fast-path code

    It implements the W3C recommended UTF-8 decoder.
    https://www.w3.org/TR/encoding/#utf-8-decoder

    Behavior change of the fast-path UTF-8 decoder
    - No longer behaves like a decoder for Modified UTF-8 and CESU-8 sequences
      -- If an app needs to decode a Modified UTF-8 / CESU-8 sequence, the app can use the public API DataInputStream.readUTF or the JNI function NewStringUTF.
         See example at StringTest.decodeModifiedUTF8
    - Treat an overlong sequence as ill-formed. For example, the byte sequence "c0 b1" is an overlong form of character '1' U+0031.
    - Treat a surrogate (U+D800..U+DFFF) as ill-formed
    - Each maximal subpart should be replaced by a single U+FFFD. For example, in the byte sequence "41 C0 AF 41 F4 80 80 41", the maximal subparts are "C0", "AF", and "F4 80 80". "F4 80 80" can be the initial subsequence of "F4 80 80 80", but "C0 AF" can't be the initial subsequence of any well-formed code unit sequence. Thus, the output should be "A\ufffd\ufffdA\ufffdA".

    Test change:
    - CharsetEncoder2Test.testUtf8Encoding: a UTF-8 encoded surrogate is treated as invalid
    - X500PrincipalTest.testValidDN: an overlong sequence is now treated as invalid. According to my test, Android Conscrypt (and BoringSSL) has rejected a certificate with such an overlong sequence in CN since OC MR1. Thus, there is little use case for creating an X500Principal with an overlong UTF-8 sequence. Also, the RI doesn't pass this test either.

    Context: From my understanding, certificates and X500 principals are stored in ASN.1 format. The RFC standards quoted in X500Principal don't prohibit overlong UTF-8 sequences. But the newer standards, RFC5280 for X.509 and RFC3629 for UTF-8, explicitly prohibit any overlong UTF-8 sequences.

    Performance change: The performance of the fast-path decoder is similar before and after the change.

    === Before the change ===
    CharsetBenchmark
    Experiment {instrument=runtime, benchmarkMethod=time_new_String_BString, vm=default, parameters={length=10000, name=UTF-8}}
    Results: runtime(ns): min=574795.84, 1st qu.=574795.84, median=574795.84, mean=574795.84, 3rd qu.=574795.84, max=574795.84
    Trial Report (1 of 4):
    CharsetUtf8Benchmark
    Experiment {instrument=runtime, benchmarkMethod=time_ascii, vm=default, parameters={}}
    Results: runtime(ns): min=58290943.00, 1st qu.=58290943.00, median=58290943.00, mean=58290943.00, 3rd qu.=58290943.00, max=58290943.00
    Trial Report (2 of 4):
    Experiment {instrument=runtime, benchmarkMethod=time_bmp2, vm=default, parameters={}}
    Results: runtime(ns): min=77581414.00, 1st qu.=77581414.00, median=77581414.00, mean=77581414.00, 3rd qu.=77581414.00, max=77581414.00
    Trial Report (3 of 4):
    Experiment {instrument=runtime, benchmarkMethod=time_bmp3, vm=default, parameters={}}
    Results: runtime(ns): min=57457297.00, 1st qu.=57457297.00, median=57457297.00, mean=57457297.00, 3rd qu.=57457297.00, max=57457297.00
    Trial Report (4 of 4):
    Experiment {instrument=runtime, benchmarkMethod=time_supplementary, vm=default, parameters={}}
    Results: runtime(ns): min=60723183.00, 1st qu.=60723183.00, median=60723183.00, mean=60723183.00, 3rd qu.=60723183.00, max=60723183.00

    === After the change ===
    CharsetBenchmark
    Experiment {instrument=runtime, benchmarkMethod=time_new_String_BString, vm=default, parameters={length=10000, name=UTF-8}}
    Results: runtime(ns): min=523638.25, 1st qu.=523638.25, median=523638.25, mean=523638.25, 3rd qu.=523638.25, max=523638.25
    CharsetUtf8Benchmark
    Trial Report (1 of 4):
    Experiment {instrument=runtime, benchmarkMethod=time_ascii, vm=default, parameters={}}
    Results: runtime(ns): min=57101725.00, 1st qu.=57101725.00, median=57101725.00, mean=57101725.00, 3rd qu.=57101725.00, max=57101725.00
    Trial Report (2 of 4):
    Experiment {instrument=runtime, benchmarkMethod=time_bmp2, vm=default, parameters={}}
    Results: runtime(ns): min=76573080.00, 1st qu.=76573080.00, median=76573080.00, mean=76573080.00, 3rd qu.=76573080.00, max=76573080.00
    Trial Report (3 of 4):
    Experiment {instrument=runtime, benchmarkMethod=time_bmp3, vm=default, parameters={}}
    Results: runtime(ns): min=59655214.00, 1st qu.=59655214.00, median=59655214.00, mean=59655214.00, 3rd qu.=59655214.00, max=59655214.00
    Trial Report (4 of 4):
    Experiment {instrument=runtime, benchmarkMethod=time_supplementary, vm=default, parameters={}}
    Results: runtime(ns): min=67283548.00, 1st qu.=67283548.00, median=67283548.00, mean=67283548.00, 3rd qu.=67283548.00, max=67283548.00

    Test: cts-tradefed run cts-dev -m CtsLibcoreTestCases
    Test: cts-tradefed run cts-dev -m CtsLibcoreOjTestCases
    Bug: 69599767
    Bug: 70511691
    Change-Id: I2c3e84808b19c969905813f6654ba552b6745354
fa5b565a3f6c6d7cbd6106ee8d360304c3a939a3	17-Feb-2017	Igor Murashkin <iam@google.com>	jni: Switch to @FastNative for all JNI functions. Switches all (248) methods that previously used !bang JNI in art/libcore to all use @FastNative. As a nice benefit, this should be about 1.5x faster than before for those method calls. This measures out to a 3% startup time improvement for system_server. Test: make test-art-host Bug: 34955272 Change-Id: I0881f401c7660c79f275235362777bfa58241deb
04b80a24b7d6c36367529c606ca28b9a3729eeb6	29-Feb-2016	Roland Levillain <rpl@google.com>	Improve documentation about StringFactory.newStringFromChars. Make it clear that the native method requires its third argument to be non-null. Bug: 27378573 Change-Id: I4c42d5cb8f8f4ff20c42dbdf1e600f40be10607a
9c733c7f353c888889651beae9f30c7c42d73e05	26-Feb-2016	Roland Levillain <rpl@google.com>	Fix a typo in a comment. Change-Id: I8969820fb38776422d6d00b7f8c0f1f658ec4591
83c7414449bc406b581f0cb81ae06e7bce91403c	15-Jan-2014	Jeff Hao <jeffhao@google.com>	Removed offset and value from String and added StringFactory. Change-Id: I55314ceb906d0bf7e78545dcd9bc3489a5baf03f
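
Illustration for revision 2259a71b (22-Dec-2017): the maximal-subpart replacement rule described in that commit message can be sketched as a small state machine modeled on the W3C Encoding standard's UTF-8 decoder. This is not the code in StringFactory.java; the class name `Utf8DecoderSketch` and the method `decode` are hypothetical names chosen for this example, and the sketch assumes the whole input is available as one byte array.

```java
// Hypothetical sketch of a W3C-style UTF-8 decoder; not the libcore implementation.
final class Utf8DecoderSketch {
    private static final char REPLACEMENT = '\uFFFD';

    static String decode(byte[] in) {
        StringBuilder out = new StringBuilder(in.length);
        int codePoint = 0;   // bits accumulated so far
        int needed = 0;      // continuation bytes expected for the current sequence
        int seen = 0;        // continuation bytes consumed so far
        int lower = 0x80;    // allowed range for the next continuation byte
        int upper = 0xBF;

        for (int i = 0; i < in.length; i++) {
            int b = in[i] & 0xFF;
            if (needed == 0) {
                if (b <= 0x7F) {
                    out.append((char) b);
                } else if (b >= 0xC2 && b <= 0xDF) {   // C0 and C1 would be overlong
                    needed = 1;
                    codePoint = b & 0x1F;
                } else if (b >= 0xE0 && b <= 0xEF) {
                    if (b == 0xE0) lower = 0xA0;       // reject overlong 3-byte forms
                    if (b == 0xED) upper = 0x9F;       // reject surrogates U+D800..U+DFFF
                    needed = 2;
                    codePoint = b & 0x0F;
                } else if (b >= 0xF0 && b <= 0xF4) {
                    if (b == 0xF0) lower = 0x90;       // reject overlong 4-byte forms
                    if (b == 0xF4) upper = 0x8F;       // reject values above U+10FFFF
                    needed = 3;
                    codePoint = b & 0x07;
                } else {
                    out.append(REPLACEMENT);           // stray continuation or invalid lead
                }
                continue;
            }
            if (b < lower || b > upper) {
                // The bytes consumed so far form one maximal subpart: emit a single
                // U+FFFD and reprocess the current byte as the start of a new sequence.
                out.append(REPLACEMENT);
                codePoint = 0; needed = 0; seen = 0;
                lower = 0x80; upper = 0xBF;
                i--;
                continue;
            }
            lower = 0x80;
            upper = 0xBF;
            codePoint = (codePoint << 6) | (b & 0x3F);
            if (++seen == needed) {
                out.appendCodePoint(codePoint);
                codePoint = 0; needed = 0; seen = 0;
            }
        }
        if (needed != 0) {
            out.append(REPLACEMENT);                   // input ended mid-sequence
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The example byte sequence from the commit message.
        byte[] sample = {0x41, (byte) 0xC0, (byte) 0xAF, 0x41,
                         (byte) 0xF4, (byte) 0x80, (byte) 0x80, 0x41};
        System.out.println(decode(sample).equals("A\uFFFD\uFFFDA\uFFFDA"));
    }
}
```

Narrowing `lower` and `upper` right after the lead byte is what lets overlong forms and surrogates be rejected at the second byte, so each rejected byte becomes its own U+FFFD while a truncated but otherwise valid prefix such as "F4 80 80" collapses into one.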