History log of /art/runtime/gc/collector/garbage_collector.cc
Revision Date Author Comments
41b175aba41c9365a1c53b8a1afbd17129c87c14 19-May-2015 Vladimir Marko <vmarko@google.com> ART: Clean up arm64 kNumberOfXRegisters usage.

Avoid undefined behavior for arm64 stemming from 1u << 32 in
loops with upper bound kNumberOfXRegisters.

Create iterators for enumerating bits in an integer either
from high to low or from low to high and use them for
<arch>Context::FillCalleeSaves() on all architectures.

Refactor runtime/utils.{h,cc} by moving all bit-fiddling
functions to runtime/base/bit_utils.{h,cc} (together with
the new bit iterators) and all time-related functions to
runtime/base/time_utils.{h,cc}. Improve test coverage and
fix some corner cases for the bit-fiddling functions.

Bug: 13925192

(cherry picked from commit 80afd02024d20e60b197d3adfbb43cc303cf29e0)

Change-Id: I905257a21de90b5860ebe1e39563758f721eab82
bf9fc581e8870faddbd320a935f9a627da724c48 14-Mar-2015 Mathieu Chartier <mathieuc@google.com> Add more info to who called SuspendAll

Helps diagnose related jank.

Change-Id: I38191cdda723c6f0355d0197c494a3dff2b6653c
4460a84be92b5a94ecfb5c650aef4945ab849c93 09-Mar-2015 Hiroshi Yamauchi <yamauchi@google.com> Rosalloc thread local allocation path without a cas.

Speedup on N4:
MemAllocTest 3044 -> 2396 (~21% reduction)
BinaryTrees 4101 -> 2929 (~26% reduction)

Bug: 9986565
Change-Id: Ia1d1a37b9e001f903c3c056e8ec68fc8c623a78b
70a596d61f8cf5b6447326c46c3386e0fbd5bfb5 17-Dec-2014 Mathieu Chartier <mathieuc@google.com> Add thread suspend histogram

Helps measure time to suspend.
Example output (maps after a few seconds):
suspend all histogram: Sum: 2.806ms 99% C.I. 2us-1090.560us Avg: 43.843us Max: 1126us

Change-Id: I7bd9dd3b401fb3e3059e8718556d60910e541611
cf7f19135f0e273f7b0136315633c2abfc715343 23-Oct-2014 Ian Rogers <irogers@google.com> C++11 related clean-up of DISALLOW_..

Move DISALLOW_COPY_AND_ASSIGN to delete functions. By no having declarations
with no definitions this prompts better warning messages so deal with these
by correcting the code.
Add a DISALLOW_ALLOCATION and use for ValueObject and mirror::Object.
Make X86 assembly operand types ValueObjects to fix compilation errors.
Tidy the use of iostream and ostream.
Avoid making cutils a dependency via mutex-inl.h for tests that link against
libart. Push tracing dependencies into appropriate files and mutex.cc.
x86 32-bit host symbols size is increased for libarttest, avoid copying this
in run-test 115 by using symlinks and remove this test's higher than normal
ulimit.
Fix the RunningOnValgrind test in RosAllocSpace to not use GetHeap as it
returns NULL when the heap is under construction by Runtime.

Change-Id: Ia246f7ac0c11f73072b30d70566a196e9b78472b
c7dd295a4e0cc1d15c0c96088e55a85389bade74 22-Oct-2014 Ian Rogers <irogers@google.com> Tidy up logging.

Move gVerboseMethods to CompilerOptions. Now "--verbose-methods=" option to
dex2oat rather than runtime argument "-verbose-methods:".
Move ToStr and Dumpable out of logging.h, move LogMessageData into logging.cc
except for a forward declaration.
Remove ConstDumpable as Dump methods are all const (and make this so if not
currently true).
Make LogSeverity an enum and improve compile time assertions and type checking.
Remove log_severity.h that's only used in logging.h.
With system headers gone from logging.h, go add to .cc files missing system
header includes.
Also, make operator new in ValueObject private for compile time instantiation
checking.

Change-Id: I3228f614500ccc9b14b49c72b9821c8b0db3d641
376fc3caf0f0b9cb63592ff3bac06420f6b13ba8 09-Aug-2014 Mathieu Chartier <mathieuc@google.com> Check pause histogram sample size.

There is a race where the GC sees > 0 iterations but 0 pauses.
We now check that there is a non zero number of pauses before
printing the pause histogram.

Bug: 16898792
Change-Id: I87813e5e6f27871ef79f70792925519d112f3534
104fa0c0c7dad925d9f4d5c101a8064cd6830da7 07-Aug-2014 Mathieu Chartier <mathieuc@google.com> Guard pause histogram with lock.

There is a race condition since the GC was updating this without
holding any locks. But the signal catcher could be dumping the
timings with DumpGcPerformanceInfo at the same time. This could
potentially cause out of bound errors, etc..

Also did a bit of cleanup.

Bug: 15446488
Change-Id: Icaff19d872cc7f7d31c34e4ddaae97502454e09c
f5997b4d3f889569d5a2b724d83d764bfbb8d106 20-Jun-2014 Mathieu Chartier <mathieuc@google.com> More advanced timing loggers.

The new timing loggers have lower overhead since they only push into
a vector. The new format has two types, a start timing and a stop
timing. You can thing of these as brackets associated with a
timestamp. It uses these to construct various statistics when needed,
such as: Total time, exclusive time, and nesting depth.

Changed PrettyDuration to have a default of 3 digits after the decimal
point.

Exaple of a GC dump with exclusive / total times and indenting:
I/art (23546): GC iteration timing logger [Exclusive time] [Total time]
I/art (23546): 0ms InitializePhase
I/art (23546): 0.305ms/167.746ms MarkingPhase
I/art (23546): 0ms BindBitmaps
I/art (23546): 0ms FindDefaultSpaceBitmap
I/art (23546): 0ms/1.709ms ProcessCards
I/art (23546): 0.183ms ImageModUnionClearCards
I/art (23546): 0.916ms ZygoteModUnionClearCards
I/art (23546): 0.610ms AllocSpaceClearCards
I/art (23546): 1.373ms AllocSpaceClearCards
I/art (23546): 0.305ms/6.318ms MarkRoots
I/art (23546): 2.106ms MarkRootsCheckpoint
I/art (23546): 0.153ms MarkNonThreadRoots
I/art (23546): 4.287ms MarkConcurrentRoots
I/art (23546): 43.461ms UpdateAndMarkImageModUnionTable
I/art (23546): 0ms/112.712ms RecursiveMark
I/art (23546): 112.712ms ProcessMarkStack
I/art (23546): 0.610ms/2.777ms PreCleanCards
I/art (23546): 0.305ms/0.855ms ProcessCards
I/art (23546): 0.153ms ImageModUnionClearCards
I/art (23546): 0.610ms ZygoteModUnionClearCards
I/art (23546): 0.610ms AllocSpaceClearCards
I/art (23546): 0.549ms AllocSpaceClearCards
I/art (23546): 0.549ms MarkRootsCheckpoint
I/art (23546): 0.610ms MarkNonThreadRoots
I/art (23546): 0ms MarkConcurrentRoots
I/art (23546): 0.610ms ScanGrayImageSpaceObjects
I/art (23546): 0.305ms ScanGrayZygoteSpaceObjects
I/art (23546): 0.305ms ScanGrayAllocSpaceObjects
I/art (23546): 1.129ms ScanGrayAllocSpaceObjects
I/art (23546): 0ms ProcessMarkStack
I/art (23546): 0ms/0.977ms (Paused)PausePhase
I/art (23546): 0.244ms ReMarkRoots
I/art (23546): 0.672ms (Paused)ScanGrayObjects
I/art (23546): 0ms (Paused)ProcessMarkStack
I/art (23546): 0ms/0.610ms SwapStacks
I/art (23546): 0.610ms RevokeAllThreadLocalAllocationStacks
I/art (23546): 0ms PreSweepingGcVerification
I/art (23546): 0ms/10.621ms ReclaimPhase
I/art (23546): 0.610ms/0.702ms ProcessReferences
I/art (23546): 0.214ms/0.641ms EnqueueFinalizerReferences
I/art (23546): 0.427ms ProcessMarkStack
I/art (23546): 0.488ms SweepSystemWeaks
I/art (23546): 0.824ms/9.400ms Sweep
I/art (23546): 0ms SweepMallocSpace
I/art (23546): 0.214ms SweepZygoteSpace
I/art (23546): 0.122ms SweepMallocSpace
I/art (23546): 6.226ms SweepMallocSpace
I/art (23546): 0ms SweepMallocSpace
I/art (23546): 2.144ms SweepLargeObjects
I/art (23546): 0.305ms SwapBitmaps
I/art (23546): 0ms UnBindBitmaps
I/art (23546): 0.275ms FinishPhase
I/art (23546): GC iteration timing logger: end, 178.971ms

Change-Id: Ia55b65609468f212b3cd65cda66b843da42be645
10fb83ad7442c8cf3356a89ec918e0786f110981 16-Jun-2014 Mathieu Chartier <mathieuc@google.com> Shared single GC iteration accounting for all GCs.

Previously, each garbage collector had data that was only used
during collection. Since only one collector can be running at any
given time, we can make this data be shared between all collectors.
This reduces memory usage since we don't need to have redundant
information for each GC types. Also reduced how much code is required
to sweep spaces.

Bug: 9969166
Change-Id: I31caf0ee4d572f75e0c66863fe7db12c08ae08e7
19d46b44f2abe742be22e32908dbfd9e6dd9bfea 18-Jun-2014 Mathieu Chartier <mathieuc@google.com> Fix systrace logging, total paused time, and bytes saved message.

Moved the GC top level systrace logging to be inside of Collector::Run.
This prevents cases where we forgot to call it such as background
compaction. Fixed a unit error regarding total pause time. Fixed
negative bytes saved to use the word "expanded".

Bug: 15702709

Change-Id: Ic2991ecad2daa000d0aee9d559b8bc77d8c160aa
e76e70f424468f311c2061c291e8384263f3968c 03-May-2014 Mathieu Chartier <mathieuc@google.com> Add RecordFree to the GarbageCollector interface

RecordFree now calls the Heap::RecordFree as well as updates the
garbage collector's internal bytes freed accounting.

Change-Id: I8cb03748b0768e3c8c50ea709572960e6e4ad219
6f365cc033654a5a3b45eaa1379d4b5f156b0cee 23-Apr-2014 Mathieu Chartier <mathieuc@google.com> Enable concurrent sweeping for non-concurrent GC.

Refactored the GarbageCollector to let all of the phases be run by
the collector's RunPhases virtual method. This lets the GC decide
which phases should be concurrent and reduces how much baked in GC
logic resides in GarbageCollector.

Enabled concurrent sweeping in the semi space and non concurrent
mark sweep GCs. Changed the semi-space collector to have a swap semi
spaces boolean which can be changed with a setter.

Fixed tests to pass with GSS collector, there was an error related to
the large object space limit.

Before (EvaluateAndApplyChanges):
GSS paused GC time 7.81s/7.81s, score: 3920

After (EvaluateAndApplyChanges):
GSS paused GC time 6.94s/7.71s, score: 3900

Benchmark score doesn't go up since the GC happens in the allocating
thread. There is a slight reduction in pause times experienced by
other threads (0.8s total).

Added options for pre sweeping GC heap verification and pre sweeping
rosalloc verification.

Bug: 14226004
Bug: 14250892
Bug: 14386356

Change-Id: Ib557d0590c1ed82a639d0f0281ba67cf8cae938c
62ab87bb3ff4830def25a1716f6785256c7eebca 28-Apr-2014 Mathieu Chartier <mathieuc@google.com> Always log explicit GC.

People who use DDMS want to see that a GC actually occurs when they
press GC button.

Bug: 14325353
Change-Id: I44e0450c92abf7223d33552ed37f626fe63e1c28
bbd695c71e0bf518f582e84524e1cdeb3de3896c 16-Apr-2014 Mathieu Chartier <mathieuc@google.com> Replace ObjectSet with LargeObjectBitmap.

Speeds up large object marking since large objects no longer required
a lock. Changed the GCs to use the heap bitmap for marking objects
which aren't in the fast path. This eliminates the need for a
MarkLargeObject function.

Maps before (10 GC iterations):
Mean partial time: 180ms
Mean sticky time: 151ms

Maps after:
Mean partial time: 161ms
Mean sticky time: 101ms

Note: the GC durations are long due to recent ergonomic changes and
because the fast bulk free hasn't yet been enabled. Over 50% of the
GC time is spent in RosAllocSpace::FreeList.

Bug: 13571028

Change-Id: Id8f94718aeaa13052672ccbae1e8edf77d653f62
a8e8f9c0a8e259a807d7b99a148d14104c24209d 09-Apr-2014 Mathieu Chartier <mathieuc@google.com> Refactor space bitmap to support different alignments.

Required for:
Using space bitmaps instead of std::set in mod union table +
remembered set.
Using a bitmap instead of set for large object marking.

Bug: 13571028

Change-Id: Id024e9563d4ca4278f79607cdb2f81895121b113
5a48719b516a52d1a6800d8ae6f7dcba3d883bdc 08-Apr-2014 Mathieu Chartier <mathieuc@google.com> Reset GC timings after SIGQUIT.

We now reset the GC timings when a SIGQUIT happens, this is useful
for excluding GCs which happen during the initialization of an app
when measuring GC performance.

Change-Id: I68c79bdb279290c12ae588bc7e95ac24908c157e
440e4ceb310349ee8eb569495bc04d3d7fbe71cb 01-Apr-2014 Mathieu Chartier <mathieuc@google.com> Add monitor deflation.

We now deflate the monitors when we perform a heap trim. This causes
a pause but it shouldn't matter since we should be in a state where
we don't care about pauses. Memory savings are hard to measure.

Fixed integer overflow bug in GetEstimatedLastIterationThroughput.

Bug: 13733906
Change-Id: I4e0e68add02e7f43370b3a5ea763d6fe8a5b212c
0f7bf6a3ad1798fde328a2bff48a4bf2d750a36b 28-Mar-2014 Mathieu Chartier <mathieuc@google.com> Swap allocation stacks in pause.

This enables us to collect objects allocated during the GC for both
sticky, partial, and full GC. This also significantly simplifies GC
code. No measured performance impact on benchmarks, but this should
slightly increase sticky GC throughput.

Changed RevokeRosAllocThreadLocalBuffers to happen at most once per
GC. Previously it occured twice if pre-cleaning was enabled.

Renamed HandleDirtyObjectsPhase to PausePhase and enabled it for
non-concurrent GC. This helps reduce duplicated code which was in
both HandleDirtyObjectsPhase for concurrent GC and ReclaimPhase for
non-concurrent GC.

Change-Id: I533414b5c2cd2800f00724418e0ff90e7fdb0252
4aeec176eaf11fe03f342aadcbb79142230270ed 28-Mar-2014 Mathieu Chartier <mathieuc@google.com> Refactor some GC code.

Reduced amount of code in mark sweep / semi space by moving
common logic to garbage_collector.cc. Cleaned up mod union tables
and deleted an unused implementation.

Change-Id: I4bcc6ba41afd96d230cfbaf4d6636f37c52e37ea
d5307ec41c8344be0c32273ec4f574064036187d 28-Mar-2014 Hiroshi Yamauchi <yamauchi@google.com> An empty collector skeleton for a read barrier-based collector.

Bug: 12687968

Change-Id: Ic2a3a7b9943ca64e7f60f4d6ed552a316ea4a6f3
afe4998fc15b8de093d6b282c9782d7182829e36 27-Mar-2014 Mathieu Chartier <mathieuc@google.com> Change sticky GC ergonomics to use GC throughput.

The old sticky ergonomics used partial/full GC when the bytes until
the footprint limit was < min free. This was suboptimal. The new
sticky GC ergonomics do partial/full GC when the throughput
of the current sticky GC iteration is <= mean throughput of the
partial/full GC.

Total GC time on FormulaEvaluationActions.EvaluateAndApplyChanges.
Before: 26.4s
After: 24.8s
No benchmark score change measured.

Bug: 8788501

Change-Id: I90000305e93fd492a8ef5a06ec9620d830eaf90d
c93c530efc175954160c3834c93961a1a946a35a 21-Mar-2014 Hiroshi Yamauchi <yamauchi@google.com> Revoke rosalloc thread-local buffers at the checkpoint.

In the mark sweep collector, rosalloc thread-local buffers were
revoked during the pause. Now, they are revoked at the thread
checkpoint, as opposed to during the pause, which appears to help
reduce the pause time.

In Ritz MemAllocTest, the average sticky pause time went down ~20%
(925 us -> 724 us).

Bug: 13394464
Bug: 9986565
Change-Id: I104992a11b46d59264c0b9aa2db82b1ccf2826bc
3e41780cb3bcade3b724908e00443a9caf6977ef 20-Mar-2014 Hiroshi Yamauchi <yamauchi@google.com> Refactor the garbage collector driver (GarbageCollector::Run).

Bug: 12687968

Change-Id: Ifc9ee86249f7938f51495ea1498cf0f7853a27e8
d6534315596326f1a65aa2d300144c09205c5122 10-Mar-2014 Mathieu Chartier <mathieuc@google.com> Add timing split for RevokeAllThreadLocalBuffers.

This is part of the pause and should be accounted for.

Change-Id: I3165324de810e8fab02719098977402a18013da1
a4adbfd44032d70e166e6f18096bbbed05a990ba 05-Feb-2014 Hiroshi Yamauchi <yamauchi@google.com> RosAlloc verification.

If enabled, RosAlloc verification checks the allocator internal
metadata and invariants to detect bugs, heap corruptions, and race
conditions. Added runtime options for enabling and disabling
it. Enable it for the debug build.

Bug: 9986565
Bug: 12592026
Change-Id: I923742b87805ae839f1549d78d0d492733da6a58
1f3b5358b28a83f0929bdd8ce738f06908677fb7 03-Feb-2014 Mathieu Chartier <mathieuc@google.com> Move SwapBitmaps to ContinuousMemMapAllocSpace.

Moved the SwapBitmaps function to ContinuousMemMapAllocSpace since
the zygote space uses this function during full GC.

Fixed a place where we were casting a ZygoteSpace to a MallocSpace,
somehow this didn't cause any issues in non-debug builds.

Moved the CollectGarbage in PreZygoteFork before the lock to prevent
an occasional lock level violation caused by attempting to enqueue
java lang references with the a lock.

Bug: 12876255

Change-Id: I77439e46d5b26b37724bdcee3a0948410f1b0eb4
6f4ffe41649f1e6381e8cda087ad3749206806e5 13-Jan-2014 Hiroshi Yamauchi <yamauchi@google.com> Improve the generational mode.

- Turn the compile-time flags for generational mode into a command
line flag.

- In the generational mode, always collect the whole heap, as opposed
to the bump pointer space only, if a collection is an explicit,
native allocation-triggered or last attempt one.

Change-Id: I7a14a707cc47e6e3aa4a3292db62533409f17563
db7f37d57b6ac83abe6815d0cd5c50701b6be821 10-Jan-2014 Mathieu Chartier <mathieuc@google.com> Refactor large object sweeping.

Moved basic sweeping logic into large_object_space.cc.
Renamed SpaceSetMap -> ObjectSet.

Change-Id: I938c1f29f69b0682350347da2bd5de021c0e0224
e6da9af8dfe0a3e3fbc2be700554f6478380e7b9 16-Dec-2013 Mathieu Chartier <mathieuc@google.com> Background compaction support.

When the process state changes to a state which does not perceives
jank, we copy from the main free-list backed allocation space to
the bump pointer space and enable the semispace allocator.

When we transition back to foreground, we copy back to a free-list
backed space.

Create a seperate non-moving space which only holds non-movable
objects. This enables us to quickly wipe the current alloc space
(DlMalloc / RosAlloc) when we transition to background.

Added multiple alloc space support to the sticky mark sweep GC.

Added a -XX:BackgroundGC option which lets you specify
which GC to use for background apps. Passing in
-XX:BackgroundGC=SS makes the heap compact the heap for apps which
do not perceive jank.

Results:
Simple background foreground test:
0. Reboot phone, unlock.
1. Open browser, click on home.
2. Open calculator, click on home.
3. Open calendar, click on home.
4. Open camera, click on home.
5. Open clock, click on home.
6. adb shell dumpsys meminfo

PSS Normal ART:
Sample 1:
88468 kB: Dalvik
3188 kB: Dalvik Other
Sample 2:
81125 kB: Dalvik
3080 kB: Dalvik Other

PSS Dalvik:
Total PSS by category:
Sample 1:
81033 kB: Dalvik
27787 kB: Dalvik Other
Sample 2:
81901 kB: Dalvik
28869 kB: Dalvik Other

PSS ART + Background Compaction:
Sample 1:
71014 kB: Dalvik
1412 kB: Dalvik Other
Sample 2:
73859 kB: Dalvik
1400 kB: Dalvik Other

Dalvik other reduction can be explained by less deep allocation
stacks / less live bitmaps / less dirty cards.

TODO improvements: Recycle mem-maps which are unused in the current
state. Not hardcode 64 MB capacity of non movable space (avoid
returning linear alloc nightmares). Figure out ways to deal with low
virtual address memory problems.

Bug: 8981901

Change-Id: Ib235d03f45548ffc08a06b8ae57bf5bada49d6f3
692fafd9778141fa6ef0048c9569abd7ee0253bf 30-Nov-2013 Mathieu Chartier <mathieuc@google.com> Thread local bump pointer allocator.

Added a thread local allocator to the heap, each thread has three
pointers which specify the thread local buffer: start, cur, and
end. When the remaining space in the thread local buffer isn't large
enough for the allocation, the allocator allocates a new thread
local buffer using the bump pointer allocator.

The bump pointer space had to be modified to accomodate thread
local buffers. These buffers are called "blocks", where a block
is a buffer which contains a set of adjacent objects. Blocks
aren't necessarily full and may have wasted memory towards the
end. Blocks have an 8 byte header which specifies their size and is
required for traversing bump pointer spaces.

Memory usage is in between full bump pointer and ROSAlloc since
madvised memory limits wasted ram to an average of 1/2 page per
block.

Added a runtime option -XX:UseTLAB which specifies whether or
not to use the thread local allocator. Its a NOP if the garbage
collector is not the semispace collector.

TODO: Smarter block accounting to prevent us reading objects until
we either hit the end of the block or GetClass() == null which
signifies that the block isn't 100% full. This would provide a
slight speedup to BumpPointerSpace::Walk.

Timings: -XX:HeapMinFree=4m -XX:HeapMaxFree=8m -Xmx48m
ritzperf memalloc:
Dalvik -Xgc:concurrent: 11678
Dalvik -Xgc:noconcurrent: 6697
-Xgc:MS: 5978
-Xgc:SS: 4271
-Xgc:CMS: 4150
-Xgc:SS -XX:UseTLAB: 3255

Bug: 9986565
Bug: 12042213

Change-Id: Ib7e1d4b199a8199f3b1de94b0a7b6e1730689cad
b2f9936cab87a187f078187c22d9b29d4a188a62 21-Nov-2013 Mathieu Chartier <mathieuc@google.com> Add histogram for GC pause times.

Printed when you dump the GC performance info.

Bug: 10855285
Change-Id: I3bf7f958305f97c52cb31c03bdd6218c321575b9
cf58d4adf461eb9b8e84baa8019054c88cd8acc6 26-Sep-2013 Hiroshi Yamauchi <yamauchi@google.com> A custom 'runs-of-slots' memory allocator.

Bug: 9986565
Change-Id: I0eb73b9458752113f519483616536d219d5f798b
590fee9e8972f872301c2d16a575d579ee564bee 13-Sep-2013 Mathieu Chartier <mathieuc@google.com> Compacting collector.

The compacting collector is currently similar to semispace. It works by
copying objects back and forth between two bump pointer spaces. There
are types of objects which are "non-movable" due to current runtime
limitations. These are Classes, Methods, and Fields.

Bump pointer spaces are a new type of continuous alloc space which have
no lock in the allocation code path. When you allocate from these it uses
atomic operations to increase an index. Traversing the objects in the bump
pointer space relies on Object::SizeOf matching the allocated size exactly.

Runtime changes:
JNI::GetArrayElements returns copies objects if you attempt to get the
backing data of a movable array. For GetArrayElementsCritical, we return
direct backing storage for any types of arrays, but temporarily disable
the GC until the critical region is completed.

Added a new runtime call called VisitObjects, this is used in place of
the old pattern which was flushing the allocation stack and walking
the bitmaps.

Changed image writer to be compaction safe and use object monitor word
for forwarding addresses.

Added a bunch of added SIRTs to ClassLinker, MethodLinker, etc..

TODO: Enable switching allocators, compacting on background, etc..

Bug: 8981901

Change-Id: I3c886fd322a6eef2b99388d19a765042ec26ab99
a3d2718d1fac53210b2a311b1728409d6c8e7b9d 06-Nov-2013 Brian Carlstrom <bdc@google.com> Change thread.h to thread-inl.h to pick up missing Thread::Currnet for debug build in master

Change-Id: I56a4dd18ec1c212f9dbb73b14c0c0623b23c87bd
3f9667022788ba1effcd1e47fc9e3decc4db569d 05-Sep-2013 Mathieu Chartier <mathieuc@google.com> Add more systrace logging to GC.

There was some confusing systrace messages which made it seem like
pauses were longer than they actually were due to premption occuring
during thread_list->ResumeAll().

Bug: 10612142

Change-Id: I6eeedd1cf85ff38c5b116f15059469db52cbb73b
720ef7680573c1afd12f99f02eee3045daee5168 17-Aug-2013 Mathieu Chartier <mathieuc@google.com> Fix non concurrent GC ergonomics.

If we dont have concurrent GC enabled, we need to force GC for alloc
when we hit the maximum allowed footprint so that our heap doesn't
keep growing until it hits the growth limit.

Refactored a bit of stuff.

Change-Id: I8eceac4ef01e969fd286ebde3a735a09d0a6dfc1
94c32c5f01c7d44781317bf23933ed0a5bc4b796 09-Aug-2013 Mathieu Chartier <mathieuc@google.com> More parallel GC, rewritten parallel mark stack processing.

Card scanning may now be done in parallel. This speeds up sticky and
reduces pause times for all GC types.

Speedup on my mako (ritz perf):
Average pause time for sticky GC (~250 samples):
Without parallel cards scanning enabled: 2.524904215ms
Parallel card scanning (num_gc_threads_): 1.552123552ms
Throughput (~250 samples):
Sticky GC throughput with parallel card scanning: 69MB/s
Sticky GC throughput without parallel card scanning: 51MB/s

Rewrote the mark stack processing to be LIFO and use a prefetch queue
like the non parallel version.

Cleaned up some of the logcat printing for the activity manager
process state listening.

Added unlikely hints to object scanning since arrays and classes are
scanned much less often than normal objects.

Fixed a bug where the number of GC threads was clamped to 1 due to a
bool instead of a size_t.

Fixed a race condition when we added references to the reference
queues. Sharded the reference queue lock into one lock for each reference
type (weak, soft, phatom, finalizer).

Changed timing splits to be different for processing gray objects with
and without mutators paused since sticky GC does both.

Mask out the class bit when visiting fields as an optimization, this is
valid since classes are held live by the class linker.

Partially completed: Parallel recursive mark + finger.

Bug: 10245302
Bug: 9969166
Bug: 9986532
Bug: 9961698

Change-Id: I142d09718c4609b7c2387cb28f517a6983c73288
02e25119b15a6f619f17db99f5d05124a5807ff3 15-Aug-2013 Mathieu Chartier <mathieuc@google.com> Fix up TODO: c++0x, update cpplint.

Needed to update cpplint to handle const auto.

Fixed a few cpplint errors that were being missed before.

Replaced most of the TODO c++0x with ranged based loops. Loops which
do not have a descriptive container name have a concrete type instead
of auto.

Change-Id: Id7cc0f27030f56057c544e94277300b3f298c9c5
7940e44f4517de5e2634a7e07d58d0fb26160513 12-Jul-2013 Brian Carlstrom <bdc@google.com> Create separate Android.mk for main build targets

The runtime, compiler, dex2oat, and oatdump now are in seperate trees
to prevent dependency creep. They can now be individually built
without rebuilding the rest of the art projects. dalvikvm and jdwpspy
were already this way. Builds in the art directory should behave as
before, building everything including tests.

Change-Id: Ic6b1151e5ed0f823c3dd301afd2b13eb2d8feb81