41b175aba41c9365a1c53b8a1afbd17129c87c14 |
|
19-May-2015 |
Vladimir Marko <vmarko@google.com> |
ART: Clean up arm64 kNumberOfXRegisters usage. Avoid undefined behavior for arm64 stemming from 1u << 32 in loops with upper bound kNumberOfXRegisters. Create iterators for enumerating bits in an integer either from high to low or from low to high and use them for <arch>Context::FillCalleeSaves() on all architectures. Refactor runtime/utils.{h,cc} by moving all bit-fiddling functions to runtime/base/bit_utils.{h,cc} (together with the new bit iterators) and all time-related functions to runtime/base/time_utils.{h,cc}. Improve test coverage and fix some corner cases for the bit-fiddling functions. Bug: 13925192 (cherry picked from commit 80afd02024d20e60b197d3adfbb43cc303cf29e0) Change-Id: I905257a21de90b5860ebe1e39563758f721eab82
|
bf9fc581e8870faddbd320a935f9a627da724c48 |
|
14-Mar-2015 |
Mathieu Chartier <mathieuc@google.com> |
Add more info to who called SuspendAll. Helps diagnose related jank. Change-Id: I38191cdda723c6f0355d0197c494a3dff2b6653c
|
4460a84be92b5a94ecfb5c650aef4945ab849c93 |
|
09-Mar-2015 |
Hiroshi Yamauchi <yamauchi@google.com> |
Rosalloc thread local allocation path without a cas. Speedup on N4: MemAllocTest 3044 -> 2396 (~21% reduction) BinaryTrees 4101 -> 2929 (~26% reduction) Bug: 9986565 Change-Id: Ia1d1a37b9e001f903c3c056e8ec68fc8c623a78b
|
70a596d61f8cf5b6447326c46c3386e0fbd5bfb5 |
|
17-Dec-2014 |
Mathieu Chartier <mathieuc@google.com> |
Add thread suspend histogram. Helps measure time to suspend. Example output (maps after a few seconds): suspend all histogram: Sum: 2.806ms 99% C.I. 2us-1090.560us Avg: 43.843us Max: 1126us Change-Id: I7bd9dd3b401fb3e3059e8718556d60910e541611
|
cf7f19135f0e273f7b0136315633c2abfc715343 |
|
23-Oct-2014 |
Ian Rogers <irogers@google.com> |
C++11 related clean-up of DISALLOW_.. Move DISALLOW_COPY_AND_ASSIGN to delete functions. By not having declarations with no definitions this prompts better warning messages so deal with these by correcting the code. Add a DISALLOW_ALLOCATION and use for ValueObject and mirror::Object. Make X86 assembly operand types ValueObjects to fix compilation errors. Tidy the use of iostream and ostream. Avoid making cutils a dependency via mutex-inl.h for tests that link against libart. Push tracing dependencies into appropriate files and mutex.cc. x86 32-bit host symbols size is increased for libarttest, avoid copying this in run-test 115 by using symlinks and remove this test's higher than normal ulimit. Fix the RunningOnValgrind test in RosAllocSpace to not use GetHeap as it returns NULL when the heap is under construction by Runtime. Change-Id: Ia246f7ac0c11f73072b30d70566a196e9b78472b
|
c7dd295a4e0cc1d15c0c96088e55a85389bade74 |
|
22-Oct-2014 |
Ian Rogers <irogers@google.com> |
Tidy up logging. Move gVerboseMethods to CompilerOptions. Now "--verbose-methods=" option to dex2oat rather than runtime argument "-verbose-methods:". Move ToStr and Dumpable out of logging.h, move LogMessageData into logging.cc except for a forward declaration. Remove ConstDumpable as Dump methods are all const (and make this so if not currently true). Make LogSeverity an enum and improve compile time assertions and type checking. Remove log_severity.h that's only used in logging.h. With system headers gone from logging.h, go add to .cc files missing system header includes. Also, make operator new in ValueObject private for compile time instantiation checking. Change-Id: I3228f614500ccc9b14b49c72b9821c8b0db3d641
|
376fc3caf0f0b9cb63592ff3bac06420f6b13ba8 |
|
09-Aug-2014 |
Mathieu Chartier <mathieuc@google.com> |
Check pause histogram sample size. There is a race where the GC sees > 0 iterations but 0 pauses. We now check that there is a non zero number of pauses before printing the pause histogram. Bug: 16898792 Change-Id: I87813e5e6f27871ef79f70792925519d112f3534
|
104fa0c0c7dad925d9f4d5c101a8064cd6830da7 |
|
07-Aug-2014 |
Mathieu Chartier <mathieuc@google.com> |
Guard pause histogram with lock. There is a race condition since the GC was updating this without holding any locks. But the signal catcher could be dumping the timings with DumpGcPerformanceInfo at the same time. This could potentially cause out-of-bounds errors, etc. Also did a bit of cleanup. Bug: 15446488 Change-Id: Icaff19d872cc7f7d31c34e4ddaae97502454e09c
|
f5997b4d3f889569d5a2b724d83d764bfbb8d106 |
|
20-Jun-2014 |
Mathieu Chartier <mathieuc@google.com> |
More advanced timing loggers. The new timing loggers have lower overhead since they only push into a vector. The new format has two types, a start timing and a stop timing. You can think of these as brackets associated with a timestamp. It uses these to construct various statistics when needed, such as: Total time, exclusive time, and nesting depth. Changed PrettyDuration to have a default of 3 digits after the decimal point. Example of a GC dump with exclusive / total times and indenting: I/art (23546): GC iteration timing logger [Exclusive time] [Total time] I/art (23546): 0ms InitializePhase I/art (23546): 0.305ms/167.746ms MarkingPhase I/art (23546): 0ms BindBitmaps I/art (23546): 0ms FindDefaultSpaceBitmap I/art (23546): 0ms/1.709ms ProcessCards I/art (23546): 0.183ms ImageModUnionClearCards I/art (23546): 0.916ms ZygoteModUnionClearCards I/art (23546): 0.610ms AllocSpaceClearCards I/art (23546): 1.373ms AllocSpaceClearCards I/art (23546): 0.305ms/6.318ms MarkRoots I/art (23546): 2.106ms MarkRootsCheckpoint I/art (23546): 0.153ms MarkNonThreadRoots I/art (23546): 4.287ms MarkConcurrentRoots I/art (23546): 43.461ms UpdateAndMarkImageModUnionTable I/art (23546): 0ms/112.712ms RecursiveMark I/art (23546): 112.712ms ProcessMarkStack I/art (23546): 0.610ms/2.777ms PreCleanCards I/art (23546): 0.305ms/0.855ms ProcessCards I/art (23546): 0.153ms ImageModUnionClearCards I/art (23546): 0.610ms ZygoteModUnionClearCards I/art (23546): 0.610ms AllocSpaceClearCards I/art (23546): 0.549ms AllocSpaceClearCards I/art (23546): 0.549ms MarkRootsCheckpoint I/art (23546): 0.610ms MarkNonThreadRoots I/art (23546): 0ms MarkConcurrentRoots I/art (23546): 0.610ms ScanGrayImageSpaceObjects I/art (23546): 0.305ms ScanGrayZygoteSpaceObjects I/art (23546): 0.305ms ScanGrayAllocSpaceObjects I/art (23546): 1.129ms ScanGrayAllocSpaceObjects I/art (23546): 0ms ProcessMarkStack I/art (23546): 0ms/0.977ms (Paused)PausePhase I/art (23546): 0.244ms ReMarkRoots I/art (23546): 0.672ms (Paused)ScanGrayObjects I/art (23546): 0ms (Paused)ProcessMarkStack I/art (23546): 0ms/0.610ms SwapStacks I/art (23546): 0.610ms RevokeAllThreadLocalAllocationStacks I/art (23546): 0ms PreSweepingGcVerification I/art (23546): 0ms/10.621ms ReclaimPhase I/art (23546): 0.610ms/0.702ms ProcessReferences I/art (23546): 0.214ms/0.641ms EnqueueFinalizerReferences I/art (23546): 0.427ms ProcessMarkStack I/art (23546): 0.488ms SweepSystemWeaks I/art (23546): 0.824ms/9.400ms Sweep I/art (23546): 0ms SweepMallocSpace I/art (23546): 0.214ms SweepZygoteSpace I/art (23546): 0.122ms SweepMallocSpace I/art (23546): 6.226ms SweepMallocSpace I/art (23546): 0ms SweepMallocSpace I/art (23546): 2.144ms SweepLargeObjects I/art (23546): 0.305ms SwapBitmaps I/art (23546): 0ms UnBindBitmaps I/art (23546): 0.275ms FinishPhase I/art (23546): GC iteration timing logger: end, 178.971ms Change-Id: Ia55b65609468f212b3cd65cda66b843da42be645
|
10fb83ad7442c8cf3356a89ec918e0786f110981 |
|
16-Jun-2014 |
Mathieu Chartier <mathieuc@google.com> |
Shared single GC iteration accounting for all GCs. Previously, each garbage collector had data that was only used during collection. Since only one collector can be running at any given time, we can make this data be shared between all collectors. This reduces memory usage since we don't need to have redundant information for each GC type. Also reduced how much code is required to sweep spaces. Bug: 9969166 Change-Id: I31caf0ee4d572f75e0c66863fe7db12c08ae08e7
|
19d46b44f2abe742be22e32908dbfd9e6dd9bfea |
|
18-Jun-2014 |
Mathieu Chartier <mathieuc@google.com> |
Fix systrace logging, total paused time, and bytes saved message. Moved the GC top level systrace logging to be inside of Collector::Run. This prevents cases where we forgot to call it such as background compaction. Fixed a unit error regarding total pause time. Fixed negative bytes saved to use the word "expanded". Bug: 15702709 Change-Id: Ic2991ecad2daa000d0aee9d559b8bc77d8c160aa
|
e76e70f424468f311c2061c291e8384263f3968c |
|
03-May-2014 |
Mathieu Chartier <mathieuc@google.com> |
Add RecordFree to the GarbageCollector interface. RecordFree now calls the Heap::RecordFree as well as updates the garbage collector's internal bytes freed accounting. Change-Id: I8cb03748b0768e3c8c50ea709572960e6e4ad219
|
6f365cc033654a5a3b45eaa1379d4b5f156b0cee |
|
23-Apr-2014 |
Mathieu Chartier <mathieuc@google.com> |
Enable concurrent sweeping for non-concurrent GC. Refactored the GarbageCollector to let all of the phases be run by the collector's RunPhases virtual method. This lets the GC decide which phases should be concurrent and reduces how much baked in GC logic resides in GarbageCollector. Enabled concurrent sweeping in the semi space and non concurrent mark sweep GCs. Changed the semi-space collector to have a swap semi spaces boolean which can be changed with a setter. Fixed tests to pass with GSS collector, there was an error related to the large object space limit. Before (EvaluateAndApplyChanges): GSS paused GC time 7.81s/7.81s, score: 3920 After (EvaluateAndApplyChanges): GSS paused GC time 6.94s/7.71s, score: 3900 Benchmark score doesn't go up since the GC happens in the allocating thread. There is a slight reduction in pause times experienced by other threads (0.8s total). Added options for pre sweeping GC heap verification and pre sweeping rosalloc verification. Bug: 14226004 Bug: 14250892 Bug: 14386356 Change-Id: Ib557d0590c1ed82a639d0f0281ba67cf8cae938c
|
62ab87bb3ff4830def25a1716f6785256c7eebca |
|
28-Apr-2014 |
Mathieu Chartier <mathieuc@google.com> |
Always log explicit GC. People who use DDMS want to see that a GC actually occurs when they press GC button. Bug: 14325353 Change-Id: I44e0450c92abf7223d33552ed37f626fe63e1c28
|
bbd695c71e0bf518f582e84524e1cdeb3de3896c |
|
16-Apr-2014 |
Mathieu Chartier <mathieuc@google.com> |
Replace ObjectSet with LargeObjectBitmap. Speeds up large object marking since large objects no longer require a lock. Changed the GCs to use the heap bitmap for marking objects which aren't in the fast path. This eliminates the need for a MarkLargeObject function. Maps before (10 GC iterations): Mean partial time: 180ms Mean sticky time: 151ms Maps after: Mean partial time: 161ms Mean sticky time: 101ms Note: the GC durations are long due to recent ergonomic changes and because the fast bulk free hasn't yet been enabled. Over 50% of the GC time is spent in RosAllocSpace::FreeList. Bug: 13571028 Change-Id: Id8f94718aeaa13052672ccbae1e8edf77d653f62
|
a8e8f9c0a8e259a807d7b99a148d14104c24209d |
|
09-Apr-2014 |
Mathieu Chartier <mathieuc@google.com> |
Refactor space bitmap to support different alignments. Required for: Using space bitmaps instead of std::set in mod union table + remembered set. Using a bitmap instead of set for large object marking. Bug: 13571028 Change-Id: Id024e9563d4ca4278f79607cdb2f81895121b113
|
5a48719b516a52d1a6800d8ae6f7dcba3d883bdc |
|
08-Apr-2014 |
Mathieu Chartier <mathieuc@google.com> |
Reset GC timings after SIGQUIT. We now reset the GC timings when a SIGQUIT happens, this is useful for excluding GCs which happen during the initialization of an app when measuring GC performance. Change-Id: I68c79bdb279290c12ae588bc7e95ac24908c157e
|
440e4ceb310349ee8eb569495bc04d3d7fbe71cb |
|
01-Apr-2014 |
Mathieu Chartier <mathieuc@google.com> |
Add monitor deflation. We now deflate the monitors when we perform a heap trim. This causes a pause but it shouldn't matter since we should be in a state where we don't care about pauses. Memory savings are hard to measure. Fixed integer overflow bug in GetEstimatedLastIterationThroughput. Bug: 13733906 Change-Id: I4e0e68add02e7f43370b3a5ea763d6fe8a5b212c
|
0f7bf6a3ad1798fde328a2bff48a4bf2d750a36b |
|
28-Mar-2014 |
Mathieu Chartier <mathieuc@google.com> |
Swap allocation stacks in pause. This enables us to collect objects allocated during the GC for sticky, partial, and full GC. This also significantly simplifies GC code. No measured performance impact on benchmarks, but this should slightly increase sticky GC throughput. Changed RevokeRosAllocThreadLocalBuffers to happen at most once per GC. Previously it occurred twice if pre-cleaning was enabled. Renamed HandleDirtyObjectsPhase to PausePhase and enabled it for non-concurrent GC. This helps reduce duplicated code which was in both HandleDirtyObjectsPhase for concurrent GC and ReclaimPhase for non-concurrent GC. Change-Id: I533414b5c2cd2800f00724418e0ff90e7fdb0252
|
4aeec176eaf11fe03f342aadcbb79142230270ed |
|
28-Mar-2014 |
Mathieu Chartier <mathieuc@google.com> |
Refactor some GC code. Reduced amount of code in mark sweep / semi space by moving common logic to garbage_collector.cc. Cleaned up mod union tables and deleted an unused implementation. Change-Id: I4bcc6ba41afd96d230cfbaf4d6636f37c52e37ea
|
d5307ec41c8344be0c32273ec4f574064036187d |
|
28-Mar-2014 |
Hiroshi Yamauchi <yamauchi@google.com> |
An empty collector skeleton for a read barrier-based collector. Bug: 12687968 Change-Id: Ic2a3a7b9943ca64e7f60f4d6ed552a316ea4a6f3
|
afe4998fc15b8de093d6b282c9782d7182829e36 |
|
27-Mar-2014 |
Mathieu Chartier <mathieuc@google.com> |
Change sticky GC ergonomics to use GC throughput. The old sticky ergonomics used partial/full GC when the bytes until the footprint limit was < min free. This was suboptimal. The new sticky GC ergonomics do partial/full GC when the throughput of the current sticky GC iteration is <= mean throughput of the partial/full GC. Total GC time on FormulaEvaluationActions.EvaluateAndApplyChanges. Before: 26.4s After: 24.8s No benchmark score change measured. Bug: 8788501 Change-Id: I90000305e93fd492a8ef5a06ec9620d830eaf90d
|
c93c530efc175954160c3834c93961a1a946a35a |
|
21-Mar-2014 |
Hiroshi Yamauchi <yamauchi@google.com> |
Revoke rosalloc thread-local buffers at the checkpoint. In the mark sweep collector, rosalloc thread-local buffers were revoked during the pause. Now, they are revoked at the thread checkpoint, as opposed to during the pause, which appears to help reduce the pause time. In Ritz MemAllocTest, the average sticky pause time went down ~20% (925 us -> 724 us). Bug: 13394464 Bug: 9986565 Change-Id: I104992a11b46d59264c0b9aa2db82b1ccf2826bc
|
3e41780cb3bcade3b724908e00443a9caf6977ef |
|
20-Mar-2014 |
Hiroshi Yamauchi <yamauchi@google.com> |
Refactor the garbage collector driver (GarbageCollector::Run). Bug: 12687968 Change-Id: Ifc9ee86249f7938f51495ea1498cf0f7853a27e8
|
d6534315596326f1a65aa2d300144c09205c5122 |
|
10-Mar-2014 |
Mathieu Chartier <mathieuc@google.com> |
Add timing split for RevokeAllThreadLocalBuffers. This is part of the pause and should be accounted for. Change-Id: I3165324de810e8fab02719098977402a18013da1
|
a4adbfd44032d70e166e6f18096bbbed05a990ba |
|
05-Feb-2014 |
Hiroshi Yamauchi <yamauchi@google.com> |
RosAlloc verification. If enabled, RosAlloc verification checks the allocator internal metadata and invariants to detect bugs, heap corruptions, and race conditions. Added runtime options for enabling and disabling it. Enable it for the debug build. Bug: 9986565 Bug: 12592026 Change-Id: I923742b87805ae839f1549d78d0d492733da6a58
|
1f3b5358b28a83f0929bdd8ce738f06908677fb7 |
|
03-Feb-2014 |
Mathieu Chartier <mathieuc@google.com> |
Move SwapBitmaps to ContinuousMemMapAllocSpace. Moved the SwapBitmaps function to ContinuousMemMapAllocSpace since the zygote space uses this function during full GC. Fixed a place where we were casting a ZygoteSpace to a MallocSpace; somehow this didn't cause any issues in non-debug builds. Moved the CollectGarbage in PreZygoteFork before the lock to prevent an occasional lock level violation caused by attempting to enqueue java lang references with a lock held. Bug: 12876255 Change-Id: I77439e46d5b26b37724bdcee3a0948410f1b0eb4
|
6f4ffe41649f1e6381e8cda087ad3749206806e5 |
|
13-Jan-2014 |
Hiroshi Yamauchi <yamauchi@google.com> |
Improve the generational mode. - Turn the compile-time flags for generational mode into a command line flag. - In the generational mode, always collect the whole heap, as opposed to the bump pointer space only, if a collection is an explicit, native allocation-triggered or last attempt one. Change-Id: I7a14a707cc47e6e3aa4a3292db62533409f17563
|
db7f37d57b6ac83abe6815d0cd5c50701b6be821 |
|
10-Jan-2014 |
Mathieu Chartier <mathieuc@google.com> |
Refactor large object sweeping. Moved basic sweeping logic into large_object_space.cc. Renamed SpaceSetMap -> ObjectSet. Change-Id: I938c1f29f69b0682350347da2bd5de021c0e0224
|
e6da9af8dfe0a3e3fbc2be700554f6478380e7b9 |
|
16-Dec-2013 |
Mathieu Chartier <mathieuc@google.com> |
Background compaction support. When the process state changes to a state which does not perceive jank, we copy from the main free-list backed allocation space to the bump pointer space and enable the semispace allocator. When we transition back to foreground, we copy back to a free-list backed space. Create a separate non-moving space which only holds non-movable objects. This enables us to quickly wipe the current alloc space (DlMalloc / RosAlloc) when we transition to background. Added multiple alloc space support to the sticky mark sweep GC. Added a -XX:BackgroundGC option which lets you specify which GC to use for background apps. Passing in -XX:BackgroundGC=SS makes the heap compact the heap for apps which do not perceive jank. Results: Simple background foreground test: 0. Reboot phone, unlock. 1. Open browser, click on home. 2. Open calculator, click on home. 3. Open calendar, click on home. 4. Open camera, click on home. 5. Open clock, click on home. 6. adb shell dumpsys meminfo PSS Normal ART: Sample 1: 88468 kB: Dalvik 3188 kB: Dalvik Other Sample 2: 81125 kB: Dalvik 3080 kB: Dalvik Other PSS Dalvik: Total PSS by category: Sample 1: 81033 kB: Dalvik 27787 kB: Dalvik Other Sample 2: 81901 kB: Dalvik 28869 kB: Dalvik Other PSS ART + Background Compaction: Sample 1: 71014 kB: Dalvik 1412 kB: Dalvik Other Sample 2: 73859 kB: Dalvik 1400 kB: Dalvik Other Dalvik other reduction can be explained by less deep allocation stacks / less live bitmaps / less dirty cards. TODO improvements: Recycle mem-maps which are unused in the current state. Not hardcode 64 MB capacity of non movable space (avoid returning linear alloc nightmares). Figure out ways to deal with low virtual address memory problems. Bug: 8981901 Change-Id: Ib235d03f45548ffc08a06b8ae57bf5bada49d6f3
|
692fafd9778141fa6ef0048c9569abd7ee0253bf |
|
30-Nov-2013 |
Mathieu Chartier <mathieuc@google.com> |
Thread local bump pointer allocator. Added a thread local allocator to the heap, each thread has three pointers which specify the thread local buffer: start, cur, and end. When the remaining space in the thread local buffer isn't large enough for the allocation, the allocator allocates a new thread local buffer using the bump pointer allocator. The bump pointer space had to be modified to accommodate thread local buffers. These buffers are called "blocks", where a block is a buffer which contains a set of adjacent objects. Blocks aren't necessarily full and may have wasted memory towards the end. Blocks have an 8 byte header which specifies their size and is required for traversing bump pointer spaces. Memory usage is in between full bump pointer and ROSAlloc since madvised memory limits wasted ram to an average of 1/2 page per block. Added a runtime option -XX:UseTLAB which specifies whether or not to use the thread local allocator. It's a NOP if the garbage collector is not the semispace collector. TODO: Smarter block accounting to prevent us reading objects until we either hit the end of the block or GetClass() == null which signifies that the block isn't 100% full. This would provide a slight speedup to BumpPointerSpace::Walk. Timings: -XX:HeapMinFree=4m -XX:HeapMaxFree=8m -Xmx48m ritzperf memalloc: Dalvik -Xgc:concurrent: 11678 Dalvik -Xgc:noconcurrent: 6697 -Xgc:MS: 5978 -Xgc:SS: 4271 -Xgc:CMS: 4150 -Xgc:SS -XX:UseTLAB: 3255 Bug: 9986565 Bug: 12042213 Change-Id: Ib7e1d4b199a8199f3b1de94b0a7b6e1730689cad
|
b2f9936cab87a187f078187c22d9b29d4a188a62 |
|
21-Nov-2013 |
Mathieu Chartier <mathieuc@google.com> |
Add histogram for GC pause times. Printed when you dump the GC performance info. Bug: 10855285 Change-Id: I3bf7f958305f97c52cb31c03bdd6218c321575b9
|
cf58d4adf461eb9b8e84baa8019054c88cd8acc6 |
|
26-Sep-2013 |
Hiroshi Yamauchi <yamauchi@google.com> |
A custom 'runs-of-slots' memory allocator. Bug: 9986565 Change-Id: I0eb73b9458752113f519483616536d219d5f798b
|
590fee9e8972f872301c2d16a575d579ee564bee |
|
13-Sep-2013 |
Mathieu Chartier <mathieuc@google.com> |
Compacting collector. The compacting collector is currently similar to semispace. It works by copying objects back and forth between two bump pointer spaces. There are types of objects which are "non-movable" due to current runtime limitations. These are Classes, Methods, and Fields. Bump pointer spaces are a new type of continuous alloc space which have no lock in the allocation code path. When you allocate from these it uses atomic operations to increase an index. Traversing the objects in the bump pointer space relies on Object::SizeOf matching the allocated size exactly. Runtime changes: JNI::GetArrayElements returns copies of objects if you attempt to get the backing data of a movable array. For GetArrayElementsCritical, we return direct backing storage for any types of arrays, but temporarily disable the GC until the critical region is completed. Added a new runtime call called VisitObjects, this is used in place of the old pattern which was flushing the allocation stack and walking the bitmaps. Changed image writer to be compaction safe and use object monitor word for forwarding addresses. Added a bunch of SIRTs to ClassLinker, MethodLinker, etc. TODO: Enable switching allocators, compacting on background, etc. Bug: 8981901 Change-Id: I3c886fd322a6eef2b99388d19a765042ec26ab99
|
a3d2718d1fac53210b2a311b1728409d6c8e7b9d |
|
06-Nov-2013 |
Brian Carlstrom <bdc@google.com> |
Change thread.h to thread-inl.h to pick up missing Thread::Current for debug build in master Change-Id: I56a4dd18ec1c212f9dbb73b14c0c0623b23c87bd
|
3f9667022788ba1effcd1e47fc9e3decc4db569d |
|
05-Sep-2013 |
Mathieu Chartier <mathieuc@google.com> |
Add more systrace logging to GC. There were some confusing systrace messages which made it seem like pauses were longer than they actually were due to preemption occurring during thread_list->ResumeAll(). Bug: 10612142 Change-Id: I6eeedd1cf85ff38c5b116f15059469db52cbb73b
|
720ef7680573c1afd12f99f02eee3045daee5168 |
|
17-Aug-2013 |
Mathieu Chartier <mathieuc@google.com> |
Fix non concurrent GC ergonomics. If we don't have concurrent GC enabled, we need to force GC for alloc when we hit the maximum allowed footprint so that our heap doesn't keep growing until it hits the growth limit. Refactored a bit of stuff. Change-Id: I8eceac4ef01e969fd286ebde3a735a09d0a6dfc1
|
94c32c5f01c7d44781317bf23933ed0a5bc4b796 |
|
09-Aug-2013 |
Mathieu Chartier <mathieuc@google.com> |
More parallel GC, rewritten parallel mark stack processing. Card scanning may now be done in parallel. This speeds up sticky and reduces pause times for all GC types. Speedup on my mako (ritz perf): Average pause time for sticky GC (~250 samples): Without parallel cards scanning enabled: 2.524904215ms Parallel card scanning (num_gc_threads_): 1.552123552ms Throughput (~250 samples): Sticky GC throughput with parallel card scanning: 69MB/s Sticky GC throughput without parallel card scanning: 51MB/s Rewrote the mark stack processing to be LIFO and use a prefetch queue like the non parallel version. Cleaned up some of the logcat printing for the activity manager process state listening. Added unlikely hints to object scanning since arrays and classes are scanned much less often than normal objects. Fixed a bug where the number of GC threads was clamped to 1 due to a bool instead of a size_t. Fixed a race condition when we added references to the reference queues. Sharded the reference queue lock into one lock for each reference type (weak, soft, phantom, finalizer). Changed timing splits to be different for processing gray objects with and without mutators paused since sticky GC does both. Mask out the class bit when visiting fields as an optimization, this is valid since classes are held live by the class linker. Partially completed: Parallel recursive mark + finger. Bug: 10245302 Bug: 9969166 Bug: 9986532 Bug: 9961698 Change-Id: I142d09718c4609b7c2387cb28f517a6983c73288
|
02e25119b15a6f619f17db99f5d05124a5807ff3 |
|
15-Aug-2013 |
Mathieu Chartier <mathieuc@google.com> |
Fix up TODO: c++0x, update cpplint. Needed to update cpplint to handle const auto. Fixed a few cpplint errors that were being missed before. Replaced most of the TODO c++0x with range-based loops. Loops which do not have a descriptive container name have a concrete type instead of auto. Change-Id: Id7cc0f27030f56057c544e94277300b3f298c9c5
|
7940e44f4517de5e2634a7e07d58d0fb26160513 |
|
12-Jul-2013 |
Brian Carlstrom <bdc@google.com> |
Create separate Android.mk for main build targets. The runtime, compiler, dex2oat, and oatdump now are in separate trees to prevent dependency creep. They can now be individually built without rebuilding the rest of the art projects. dalvikvm and jdwpspy were already this way. Builds in the art directory should behave as before, building everything including tests. Change-Id: Ic6b1151e5ed0f823c3dd301afd2b13eb2d8feb81
|