History log of /art/runtime/gc/allocator/rosalloc.cc
Revision Date Author Comments
46ee31b67d7ee1bd085fbc240502053caa3cf8fa 14-Dec-2016 Andreas Gampe <agampe@google.com> ART: Move to libbase StringPrintf

Remove ART's StringPrintf implementation. Fix up clients. Add
missing includes where necessary.

Test: m test-art-host
Change-Id: I564038d5868595ac3bb88d641af1000cea940e5a
a9033d73a118536ece62c7f90d7f56064b4298ab 02-Dec-2016 Mathieu Chartier <mathieuc@google.com> Add more detail to rosalloc fragmentation OOME

Also include total number of free page bytes, space footprint, and
space max capacity.

Sample output:
Throwing OutOfMemoryError "Failed to allocate a 7012 byte allocation
with 103464 free bytes and 101KB until OOM; failed due to
fragmentation (required continguous free 8192 bytes, largest
contiguous free 4096 bytes, total free pages 4096 bytes, space
footprint 268435456 bytes, space max capacity 268435456 bytes)

Added a basic test to ensure the allocator coalesces properly.

Bug: 32997082

Test: test-art-host

Change-Id: I642b6ad34b98f6d98c10f242a6f6e926e0b42acc
709b070044354d9f47641f273edacaeeb0240ab7 13-Oct-2016 David Sehr <sehr@google.com> Remove mirror:: and ArtMethod deps in utils.{h,cc}

The latest chapter in the ongoing saga of attempting to dump a DEX
file without having to start a whole runtime instance. This episode
finds us removing references to ArtMethod/ArtField/mirror.

One aspect of this change that I would like to call out specfically
is that the utils versions of the "Pretty*" functions all were written
to accept nullptr as an argument. I have split these functions up as
follows:
1) an instance method, such as PrettyClass that obviously requires
this != nullptr.
2) a static method, that behaves the same way as the util method, but
calls the instance method if p != nullptr.
This requires using a full class qualifier for the static methods,
which isn't exactly beautiful. I have tried to remove as many cases
as possible where it was clear p != nullptr.

Bug: 22322814
Test: test-art-host
Change-Id: I21adee3614aa697aa580cd1b86b72d9206e1cb24
e74fe1e92de5edb4f4737d9f7ca7518c6abfe3c7 31-Aug-2016 Vladimir Marko <vmarko@google.com> Avoid decrementing iterator to std::set<>::begin() in RosAlloc.

Avoid undefined behavior in the expression
free_page_runs_.erase(it--);
when it == free_page_runs_.begin().

Also avoid the similar expression
free_page_runs_.erase(it++)
even though it's always well-defined.

In practice, the undefined behavior has no observable
side effects with the std::set<> implementation we use.
Therefore a regression test is not feasible.

Test: m test-art-host
Change-Id: I4bdeb6cdd068fe5da416b0e66953d5620ad5e999
bb661c0f0cb72d4bbfc2e251f6ded6949a713292 04-Apr-2016 Bilyan Borisov <bilyan.borisov@linaro.org> Refactor use of __ANDROID__ macro

We use the __ANDROID__ macro, which is provided by the toolchain, in
numerous places. This patch refactors the usage of this by defining a
new macro, ART_TARGET_ANDROID, that is being passed during build to
ART_TARGET_CFLAGS in Android.common_build.mk on the same line as
ART_TARGET. The codebase currently assumes that the existence of the
__ANDROID__ macro implies that we are compiling art for an android
target device. This is because, currently, target builds are compiled
with target toolchains that provide the macro, while host toolchains
do not. With this change this assumption is still preserved. However,
in a future patch we will add the ability to compile art for a linux
target, and in that case the ART_TARGET_ANDROID macro won't be passed
anymore.

Change-Id: I1f3a811aa735c87087d812da27fc6b08f01bad51
b62f2e6f3f8d66b3231ecec14ea9365733371b39 23-Mar-2016 Hiroshi Yamauchi <yamauchi@google.com> Add RosAlloc stats dump.

For better understanding of the RosAlloc space.

(cherrypick commit 565c2d9bce43c430d4267c82f5702160d971e712)

Bug: 27744947
Bug: 9986565

Change-Id: I8309761a68fbc143bbcd8458a9194085aace7c3e
fc067bfbcd00c65917b0a2f0964d7b40bd847903 23-Mar-2016 Hiroshi Yamauchi <yamauchi@google.com> Use smaller rosalloc run sizes.

Use 1/1/1/2/4 instead of 1/4/8/16/32 to save memory footprint.

No regressions in BinaryTrees, Ritz MemAlocTest, Ritz EAAC.

(cherrypick commit c867a275aa7a132ec1fd4f3b8c27812bda61ea73)

Bug: 27744947
Bug: 9986565
Change-Id: I8ac3fd23719e5cfcce7e5715a03f40701f3ff339
565c2d9bce43c430d4267c82f5702160d971e712 23-Mar-2016 Hiroshi Yamauchi <yamauchi@google.com> Add RosAlloc stats dump.

For better understanding of the RosAlloc space.

Bug: 27744947
Bug: 9986565
Change-Id: I02a8028b9728f6862e5e78588a368b8029bb5c1a
a16ff3c138747fc7ff62e11f1c85a3698c966559 23-Mar-2016 Hiroshi Yamauchi <yamauchi@google.com> Use smaller rosalloc run sizes.

Use 1/1/1/2/4 instead of 1/4/8/16/32 to save memory footprint.

No regressions in BinaryTrees, Ritz MemAlocTest, Ritz EAAC.

Bug: 27744947
Bug: 9986565
Change-Id: If26dfae073df86b8c6e6b411c22e50cd808599ef
b5e31f3dd5f792ff60225a4daa048a57d261cdd0 19-Feb-2016 Hiroshi Yamauchi <yamauchi@google.com> Fix rosalloc issues with valgrind.

The issue was that the MemoryToolMallocSpace constructor was
explicitly undefining the tail of the mem map, and when RosAlloc
expanded the space beyond the initial size, it gets errors from using
the zero but undefined (due to the above undefining) memory
content. RosAlloc zeroes memory on free (as opposed to zeroes on
allocation) and relied on the zero-initialized (hence defined in terms
of valgrind) mem map at the initialization time.

Change RosAlloc so that it does explicitly zeroes the entire mem map at
the initialization time and it does not rely on the zero-initialized of
the mem map.

Also, avoid explicitly changing the valgrind state in the
MemoryToolMallocSpace constructor, which happens after the allocator is
initialized because that may interfere with the allocator internal
initialization.

Bug: 27156726
Bug: 9986565

Change-Id: I3b36d2d987c25ce9ff5213278109c425f480b0d9
7ed9c561048d79083b6d0576c71a986a3123bca6 03-Feb-2016 Hiroshi Yamauchi <yamauchi@google.com> Use 8-byte increment bracket sizes for rosalloc thread local runs.

Very small space savings (< 1%) after device boot and up to 10%
allocation speedup.

Some minor cleanup.

Bug: 9986565

Change-Id: I51d791c4674d6944fe9a7ee78537ac3490c1a02c
578462103b548484e45079a749cb684f6a7e635c 11-Jun-2015 Lei Li <lei.l.li@intel.com> No need merging bulk free list again when revoking thread local runs

In RevokeThreadLocalRuns, the bracket index lock guards thread local free list
to avoid race condition with merging bulk free list to thread local free list
by GC thread in BulkFree. Thus the thread local list operation is atomic.

There are two cases when the mutator is unexpectedly terminated and then try to
revoke the thread local run for this mutator before termination:
1) Before termination, the thread local run is true, GC thread helps merge bulk
free list to thread local free list in last BulkFree. And the latest thread local
free list will be merged to free list either when this run is full or when revoking
this run in RevokeThreadLocalRuns. In this case the free list will finally be updated.
2) After termination, the thread local run is set to false, GC thread will help merge
bulk free list in the following BulkFree.
Thus no need to merge bulk free list to free list again in RevokeThreadLocalRuns.

Change-Id: I1a390f5356cd4cb5ca058216b9bf6e701cab9e77
Signed-off-by: Lei Li <lei.l.li@intel.com>
31bf42c48c4d00f0677c31264bba8d21618dae67 24-Sep-2015 Hiroshi Yamauchi <yamauchi@google.com> Use free lists instead of bitmaps within rosalloc runs.

Speedups (CMS GC/N5)
BinaryTrees: 2008 -> 1694 ms (-16%)
MemAllocTest: 2303 -> 2076 ms (-10%)

TODO: Add assembly fast path code.

Bug: 9986565

Change-Id: I9dd7cbfd8e1ae083a399e70abaf2064a959f24fa
c60e1b755c5632dfeb04c333489ede52ee5c945f 30-Jul-2015 Andreas Gampe <agampe@google.com> ART: Use __ANDROID__ instead of HAVE_ANDROID_OS

Use the proper define.

Change-Id: I71e291ac25f5d5f0187ac9b6ef2d6872f19e6085
14d90579f013b374638b599361970557ed4b3f09 16-Jul-2015 Roland Levillain <rpl@google.com> Use (D)CHECK_ALIGNED more.

Change-Id: I9d740f6a88d01e028d4ddc3e4e62b0a73ea050af
1e13374baf7dfaf442ffbf9809c37c131d681eaf 20-May-2015 Evgenii Stepanov <eugenis@google.com> Generalize Valgrind annotations in ART to support ASan.

Also add redzones around non-fixed mem_map(s).
Also extend -Wframe-larger-than limit to enable arm64 ASan build.

Change-Id: Ie572481a25fead59fc8978d2c317a33ac418516c
e48b29b2b99a6b4eceacdbdd60838ad7e789be79 16-Apr-2015 Dan Albert <danalbert@google.com> Prevent undefined behavior in RosAlloc.

In cases where remain == 0, the 32-bit value would be left shifted
32-bits, which is undefined behavior.

Change-Id: I6277279341b168536f928ce87375c395a1aa865c
2cebb24bfc3247d3e9be138a3350106737455918 22-Apr-2015 Mathieu Chartier <mathieuc@google.com> Replace NULL with nullptr

Also fixed some lines that were too long, and a few other minor
details.

Change-Id: I6efba5fb6e03eb5d0a300fddb2a75bf8e2f175cb
3481ba2c4e4f3aa80d8c6d50a9f85dacb56b508b 13-Apr-2015 Vladimir Marko <vmarko@google.com> ART: Clean up includes.

Reduce dependencies to improve incremental build times.
Break up circular dependency involving class_linker-inl.h.

Change-Id: I4be742c5c2b5cd9855beea86630fd68aab76b0db
4460a84be92b5a94ecfb5c650aef4945ab849c93 09-Mar-2015 Hiroshi Yamauchi <yamauchi@google.com> Rosalloc thread local allocation path without a cas.

Speedup on N4:
MemAllocTest 3044 -> 2396 (~21% reduction)
BinaryTrees 4101 -> 2929 (~26% reduction)

Bug: 9986565
Change-Id: Ia1d1a37b9e001f903c3c056e8ec68fc8c623a78b
5c42c29b89286e5efa4a4613132b09051ce5945b 25-Feb-2015 Vladimir Marko <vmarko@google.com> Add support for .bss section in oat files.

Change-Id: I779b80b8139d9afdc28373f8c68edff5df7726ce
fef16adc09a26aa8081c3e803bbc66e1947a97a0 20-Feb-2015 Andreas Gampe <agampe@google.com> ART: Fix RosAlloc Valgrind code

Large object verification needs to take the redzones into account
when checking the page size.

Change-Id: I0529e21d085e82f2c8a6d8552de1e7c1df3956bc
c38c5ea76dd3cfd44eec21df640161046ffc3e4c 05-Feb-2015 Mathieu Chartier <mathieuc@google.com> Clear thread local freed bits in RosAlloc::Run::InspectAllSlots

Previously we didn't take these bits into consideration. This could
cause RosAlloc::Run::InspectAllSlots to inspect recently freed
allocations as allocated.

Bug: 19193521
Change-Id: I56b3c089e2a36098423261cda623fc834069f832
3f3c6c030db14e47d3022f00403f46240623f339 20-Nov-2014 Hiroshi Yamauchi <yamauchi@google.com> Tune rosalloc buffer sizes.

We now use one-page buffers for size brackets 4-7, instead of two-page
buffers, and the first 8 size brackets for thread-local allocations,
instead of 11.

No slowdown observed with MemAllocTest, EvaluateAndApplyChanges, and
BinaryTrees.

(cherrypick commit c4cd95fa37b7138a0fa26d07c235aa409542aecd)

Bug: 18377775
Change-Id: I311f3adf9cab660d258833b17df7e6d905f73c72
c4cd95fa37b7138a0fa26d07c235aa409542aecd 20-Nov-2014 Hiroshi Yamauchi <yamauchi@google.com> Tune rosalloc buffer sizes.

We now use one-page buffers for size brackets 4-7, instead of two-page
buffers, and the first 8 size brackets for thread-local allocations,
instead of 11.

No slowdown observed with MemAllocTest, EvaluateAndApplyChanges, and
BinaryTrees.

Bug: 18377775

Change-Id: Ie1bb46bcf5d3729197e48e26a27da4cc39dd807e
d7576328811e5103e99d31f834a857522cc1463f 25-Oct-2014 Andreas Gampe <agampe@google.com> ART: Fix valgrind

Allow ValgrindMallocSpace wrapper for RosAlloc.Requires refactoring,
as ValgrindMallocSpace was bound to the signature of DlMallocSpace.

Also turn of native stack dumping when running under Valgrind to
work around b/18119146.

Ritzperf before and after
Mean 3190.725 3082.475
Standard Error 11.68407 10.37911
Mode 3069 2980
Median 3182.5 3051.5
Variance 16382.117 12927.125
Standard Deviation 127.99264 113.69751
Kurtosis 1.1065632 0.3657799
Skewness 0.9013805 0.9117792
Range 644 528
Minimum 2991 2928
Maximum 3635 3456
Count 120 120

Bug: 18119146
Change-Id: I25558ea7cb578406011dede9d3d0bdbfee4ff4d5
277ccbd200ea43590dfc06a93ae184a765327ad0 04-Nov-2014 Andreas Gampe <agampe@google.com> ART: More warnings

Enable -Wno-conversion-null, -Wredundant-decls and -Wshadow in general,
and -Wunused-but-set-parameter for GCC builds.

Change-Id: I81bbdd762213444673c65d85edae594a523836e5
6a3c1fcb4ba42ad4d5d142c17a3712a6ddd3866f 31-Oct-2014 Ian Rogers <irogers@google.com> Remove -Wno-unused-parameter and -Wno-sign-promo from base cflags.

Fix associated errors about unused paramenters and implict sign conversions.
For sign conversion this was largely in the area of enums, so add ostream
operators for the effected enums and fix tools/generate-operator-out.py.
Tidy arena allocation code and arena allocated data types, rather than fixing
new and delete operators.
Remove dead code.

Change-Id: I5b433e722d2f75baacfacae4d32aef4a828bfe1b
cf7f19135f0e273f7b0136315633c2abfc715343 23-Oct-2014 Ian Rogers <irogers@google.com> C++11 related clean-up of DISALLOW_..

Move DISALLOW_COPY_AND_ASSIGN to delete functions. By no having declarations
with no definitions this prompts better warning messages so deal with these
by correcting the code.
Add a DISALLOW_ALLOCATION and use for ValueObject and mirror::Object.
Make X86 assembly operand types ValueObjects to fix compilation errors.
Tidy the use of iostream and ostream.
Avoid making cutils a dependency via mutex-inl.h for tests that link against
libart. Push tracing dependencies into appropriate files and mutex.cc.
x86 32-bit host symbols size is increased for libarttest, avoid copying this
in run-test 115 by using symlinks and remove this test's higher than normal
ulimit.
Fix the RunningOnValgrind test in RosAllocSpace to not use GetHeap as it
returns NULL when the heap is under construction by Runtime.

Change-Id: Ia246f7ac0c11f73072b30d70566a196e9b78472b
c7dd295a4e0cc1d15c0c96088e55a85389bade74 22-Oct-2014 Ian Rogers <irogers@google.com> Tidy up logging.

Move gVerboseMethods to CompilerOptions. Now "--verbose-methods=" option to
dex2oat rather than runtime argument "-verbose-methods:".
Move ToStr and Dumpable out of logging.h, move LogMessageData into logging.cc
except for a forward declaration.
Remove ConstDumpable as Dump methods are all const (and make this so if not
currently true).
Make LogSeverity an enum and improve compile time assertions and type checking.
Remove log_severity.h that's only used in logging.h.
With system headers gone from logging.h, go add to .cc files missing system
header includes.
Also, make operator new in ValueObject private for compile time instantiation
checking.

Change-Id: I3228f614500ccc9b14b49c72b9821c8b0db3d641
2fdeecb890a353d3f17407cc1cb015e0a65c2220 16-Oct-2014 Maxim Kazantsev <maxim.kazantsev@intel.com> Rosalloc should print unreachable page map type

When rosalloc receives unexpected page map type, it is not
printed in error message because it has 'byte' type. When printed
to LOG(FATAL), it is interpreted as symbol (usually unprintable).
This patch allows to see unexpected page map types as integers.

Change-Id: Ic9d472f933862f4e2671904277990d8a83bc4c89
fc787ecd91127b2c8458afd94e5148e2ae51a1f5 10-Oct-2014 Ian Rogers <irogers@google.com> Enable -Wimplicit-fallthrough.

Falling through switch cases on a clang build must now annotate the fallthrough
with the FALLTHROUGH_INTENDED macro.
Bug: 17731372

Change-Id: I836451cd5f96b01d1ababdbf9eef677fe8fa8324
13735955f39b3b304c37d2b2840663c131262c18 08-Oct-2014 Ian Rogers <irogers@google.com> stdint types all the way!

Change-Id: I4e4ef3a2002fc59ebd9097087f150eaf3f2a7e08
58553c7fd89ce69857017322444265469bb6af62 17-Sep-2014 Mathieu Chartier <mathieuc@google.com> Add allocation tracking allocators to ROSAlloc

Used to monitor native memory usage, results are approximately
12-100KB memory per app.

Change-Id: If5a46cd8d543851948a8cb69487f3044965b44ce
f37a88b8e6db6c587fa449a12e40cb46be1689fc 10-Jul-2014 Zuo Wang <zuo.wang@intel.com> ART: Compacting ROS/DlMalloc spaces with semispace copy GC

Current semispace copy GC is mainly associated with bump pointer
spaces. Though it squeezes fragmentation most aggressively, an extra
copy is required to re-establish the data in the ROS/DlMalloc space to allow
CMS GCs to happen afterwards. As semispace copy GC is still stop-the-world,
this not only introduces unnecessary overheads but also longer response time.
Response time indicates the time duration between the start of transition
request and the start of transition animation, which may impact the user
experience.

Using semispace copy GC to compact the data in a ROS space to another ROS(or
DlMalloc space to another DlMalloc) space solves this problem. Although it
squeezes less fragmentation, CMS GCs can run immediately after the compaction.

We apply this algorithm in two cases:
1) Right before throwing an OOM if -XX:EnableHSpaceCompactForOOM is passed in
as true.
2) When app is switched to background if the -XX:BackgroundGC option has value
HSpaceCompact.

For case 1), OOMs are significantly delayed in the harmony GC stress test,
with compaction ratio up to 0.87. For case 2), compaction ratio around 0.5 is
observed in both built-in SMS and browser. Similar results have been obtained
on other apps as well.

Change-Id: Iad9eabc6d046659fda3535ae20f21bc31f89ded3
Signed-off-by: Wang, Zuo <zuo.wang@intel.com>
Signed-off-by: Chang, Yang <yang.chang@intel.com>
Signed-off-by: Lei Li <lei.l.li@intel.com>
Signed-off-by: Lin Zang <lin.zang@intel.com>
e28ed99750554e0314b893dd2aa88ff9c3ca4500 10-Jul-2014 Mathieu Chartier <mathieuc@google.com> Fix race condition in release pages.

There was a race condition where another thread could coalesce the
free page run before we acquired the lock. In that case
free_page_runs_.find will not find a run starting at fpr. Added a
condition to handle this case. Also added handling for free page
runs with begin with released pages but end with empty pages.

Bug: 16191993
Change-Id: Ib12fdac8c246eae29c36f6a6728eb11d85553bbb
654dd48e2230e16bfaa225decce72b52642e2f78 09-Jul-2014 Hiroshi Yamauchi <yamauchi@google.com> Improve the OOME fragmentation message.

Change-Id: I390d3622f8d572ec7e34ea6dff9e1e0936e81ac1
a5b5c55c8585b7ce915f0c7e1f66d121a7f7a078 24-Jun-2014 Mathieu Chartier <mathieuc@google.com> Add notion of released vs empty pages to ROSAlloc.

A notion of released vs empty pages helps get a more accurate view of
how much memory was released during heap trimming. Otherwise we get
that the same pages possibly get madvised multiple times without
getting dirtied.

Also enabled heap trimming of rosalloc spaces even when we care about
jank. This is safe to do since the trimming process only acquires
locks for short periods of time.

Dalvik PSS reduces from ~52M to ~50M after boot on N4.

Bug: 9969166

Change-Id: I4012e0a2554f413d18efe1a0371fe18d1edabaa9
f5997b4d3f889569d5a2b724d83d764bfbb8d106 20-Jun-2014 Mathieu Chartier <mathieuc@google.com> More advanced timing loggers.

The new timing loggers have lower overhead since they only push into
a vector. The new format has two types, a start timing and a stop
timing. You can thing of these as brackets associated with a
timestamp. It uses these to construct various statistics when needed,
such as: Total time, exclusive time, and nesting depth.

Changed PrettyDuration to have a default of 3 digits after the decimal
point.

Exaple of a GC dump with exclusive / total times and indenting:
I/art (23546): GC iteration timing logger [Exclusive time] [Total time]
I/art (23546): 0ms InitializePhase
I/art (23546): 0.305ms/167.746ms MarkingPhase
I/art (23546): 0ms BindBitmaps
I/art (23546): 0ms FindDefaultSpaceBitmap
I/art (23546): 0ms/1.709ms ProcessCards
I/art (23546): 0.183ms ImageModUnionClearCards
I/art (23546): 0.916ms ZygoteModUnionClearCards
I/art (23546): 0.610ms AllocSpaceClearCards
I/art (23546): 1.373ms AllocSpaceClearCards
I/art (23546): 0.305ms/6.318ms MarkRoots
I/art (23546): 2.106ms MarkRootsCheckpoint
I/art (23546): 0.153ms MarkNonThreadRoots
I/art (23546): 4.287ms MarkConcurrentRoots
I/art (23546): 43.461ms UpdateAndMarkImageModUnionTable
I/art (23546): 0ms/112.712ms RecursiveMark
I/art (23546): 112.712ms ProcessMarkStack
I/art (23546): 0.610ms/2.777ms PreCleanCards
I/art (23546): 0.305ms/0.855ms ProcessCards
I/art (23546): 0.153ms ImageModUnionClearCards
I/art (23546): 0.610ms ZygoteModUnionClearCards
I/art (23546): 0.610ms AllocSpaceClearCards
I/art (23546): 0.549ms AllocSpaceClearCards
I/art (23546): 0.549ms MarkRootsCheckpoint
I/art (23546): 0.610ms MarkNonThreadRoots
I/art (23546): 0ms MarkConcurrentRoots
I/art (23546): 0.610ms ScanGrayImageSpaceObjects
I/art (23546): 0.305ms ScanGrayZygoteSpaceObjects
I/art (23546): 0.305ms ScanGrayAllocSpaceObjects
I/art (23546): 1.129ms ScanGrayAllocSpaceObjects
I/art (23546): 0ms ProcessMarkStack
I/art (23546): 0ms/0.977ms (Paused)PausePhase
I/art (23546): 0.244ms ReMarkRoots
I/art (23546): 0.672ms (Paused)ScanGrayObjects
I/art (23546): 0ms (Paused)ProcessMarkStack
I/art (23546): 0ms/0.610ms SwapStacks
I/art (23546): 0.610ms RevokeAllThreadLocalAllocationStacks
I/art (23546): 0ms PreSweepingGcVerification
I/art (23546): 0ms/10.621ms ReclaimPhase
I/art (23546): 0.610ms/0.702ms ProcessReferences
I/art (23546): 0.214ms/0.641ms EnqueueFinalizerReferences
I/art (23546): 0.427ms ProcessMarkStack
I/art (23546): 0.488ms SweepSystemWeaks
I/art (23546): 0.824ms/9.400ms Sweep
I/art (23546): 0ms SweepMallocSpace
I/art (23546): 0.214ms SweepZygoteSpace
I/art (23546): 0.122ms SweepMallocSpace
I/art (23546): 6.226ms SweepMallocSpace
I/art (23546): 0ms SweepMallocSpace
I/art (23546): 2.144ms SweepLargeObjects
I/art (23546): 0.305ms SwapBitmaps
I/art (23546): 0ms UnBindBitmaps
I/art (23546): 0.275ms FinishPhase
I/art (23546): GC iteration timing logger: end, 178.971ms

Change-Id: Ia55b65609468f212b3cd65cda66b843da42be645
a1c1c71e24c93a720bbf13de129c75a9a0bde37a 24-Jun-2014 Mathieu Chartier <mathieuc@google.com> Use reader lock of bulk free lock when not freeing.

Should help reduce contention observed in systrace.

Change-Id: Iadb81728d4ba797c3a68acea795b15d7f212e89b
c5f17732d8144491c642776b6b48c85dfadf4b52 06-Jun-2014 Ian Rogers <irogers@google.com> Remove deprecated WITH_HOST_DALVIK.

Bug: 13751317
Fix the Mac build:
- disable x86 selector removal that causes OS/X 10.9 kernel panics,
- madvise don't need does zero memory on the Mac, factor into MemMap
routine,
- switch to the elf.h in elfutils to avoid Linux kernel dependencies,
- we can't rely on exclusive_owner_ being available from other pthread
libraries so maintain our own when futexes aren't available (we
can't rely on the OS/X 10.8 hack any more),
- fix symbol naming in assembly code,
- work around C library differences,
- disable backtrace in DumpNativeStack to avoid a broken libbacktrace
dependency,
- disable main thread signal handling logic,
- align the stack in stub_test,
- use $(HOST_SHLIB_SUFFIX) rather than .so in host make file variables.

Not all host tests are passing on the Mac with this change. dex2oat
works as does running HelloWorld.
Change-Id: I5a232aedfb2028524d49daa6397a8e60f3ee40d3
700a402244a1a423da4f3ba8032459f4b65fa18f 20-May-2014 Ian Rogers <irogers@google.com> Now we have a proper C++ library, use std::unique_ptr.

Also remove the Android.libcxx.mk and other bits of stlport compatibility
mechanics.

Change-Id: Icdf7188ba3c79cdf5617672c1cfd0a68ae596a61
5fcfa7d9d97246f7eb48a74356cb00ec2cbc0181 15-May-2014 Ian Rogers <irogers@google.com> Move RoS allocator to use unordered_set.

Work-around existing stlport issues for the target. This will go away when the
target is using libc++.

Change-Id: I8f213ecd9dc7d93d17f4a0d7e84182c12af6ca1b
52cf5c09ab706f8bbd2a2cf2fb1ef1041f020314 02-May-2014 Hiroshi Yamauchi <yamauchi@google.com> Add inline to RosAlloc::AllocFromCurrentRunUnlocked().

This appears to make MemAllocTest faster by up to ~200 ms.

Bug: 14493155
Bug: 9986565
Change-Id: I1b8ba1f3cecfa9e5b6fdc53f3ae00450be6023a8
0651d41e41341fb2e9ef3ee41dc1f1bfc832dbbb 29-Apr-2014 Mathieu Chartier <mathieuc@google.com> Add thread unsafe allocation methods to spaces.

Used by SS/GSS collectors since these run with mutators suspended and
only allocate from a single thread. Added AllocThreadUnsafe to
BumpPointerSpace and RosAllocSpace. Added AllocThreadUnsafe which uses
current runs as thread local runs for a thread unsafe allocation.
Added code to revoke current runs which are the same idx as thread
local runs.

Changed:
The number of thread local runs in each thread is now the the number
of thread local runs in RosAlloc instead of the number of size
brackets.

Total GC time / time on EvaluateAndApplyChanges.
TLAB SS:
Before: 36.7s / 7254
After: 16.1s / 4837

TLAB GSS:
Before: 6.9s / 3973
After: 5.7s / 3778

Bug: 8981901

Change-Id: Id1d264ade3799f431bf7ebbdcca6146aefbeb632
4fd2050ff1da3892b5e79276f831e09b2f8e5bc9 28-Apr-2014 Mathieu Chartier <mathieuc@google.com> Fix racy DCHECKS.

Added a size bracket lock to fix a race condition where we were
inserting into full_runs_ and non_full_runs_ holding a lock but
reading without holding a lock from a different thread.

Bug: 14326370
Change-Id: I5c492bddc4b9927e4a36603f3d787b046961675d
73d1e17b3afc7d5e56184f90bf819dc64956448a 12-Apr-2014 Mathieu Chartier <mathieuc@google.com> Enable reading page map without lock in RosAlloc::BulkFree

Enabling this flag greatly reduces how much time was spent in the GC.
It was not done previously since it was regressing MemAllocTest. With
these RosAlloc changes, the benchmark score no longer regresses after
we enable the flag.

Changed Run::AllocSlot to only have one mode of allocation. The new
mode is finding the first free bit in the bitmap. This was
previously the slow path but is now the fast path. Some optimizations
which enabled this include always having the alloc bitmap bits which
correspond to invalid slots be set to 1. This prevents us from needing
a bound check since we will never end up allocating there.

Changed revoking thread local buffer to point to an invalid run. The
invalid run is just a run which always has all the allocation bits set
to 1. When a thread attempts to do a thread local allocation from here
it will always fail and go slow path. This eliminates the need for a
null check for revoked runs.

Changed zeroing of memory to happen during free, AllocPages should
always return zeroed memory. Added prefetching which happens when we
allocate a run.

Some refactoring to reduce duplicated code.

Ergonomics changes: Changed kStickyGcThroughputAdjustment to 1.0,
this helps reduce GC time.

Measurements (3 samples per benchmark):
Before: MemAllocTest scores: 3463, 3445, 3431
EvaluateAndApplyChanges score | total GC time
Iter 1: 3485, 23.602436s
Iter 2: 3434, 22.499882s
Iter 3: 3483, 23.253274s

After: MemAllocTest scores: 3495, 3417, 3409
EvaluateAndApplyChanges score | total GC time:
Iter 1: 3375, 17.463462s
Iter 2: 3358, 16.185188s
Iter 3: 3367, 15.822312s

Bug: 8788501
Bug: 11790317
Bug: 9986565
Change-Id: Ifd273a054824028dabed27c07c081dde1816f93c
8585bad7be19ee4901333f7d02d1d4d3f04877d4 12-Apr-2014 Mathieu Chartier <mathieuc@google.com> Return bytes freed from RosAlloc.

There was a problem with how RosAlloc space sweeping worked caused by
using the object size in the FreeList call, this won't work well with
class unloading since the object's class may be freed before the
object.

Bug: 13989231
Change-Id: I3df439c312310720fd34249334dec85030166fe9
d9a88de76de4c81ad75340b824df64a68c739351 07-Apr-2014 Hiroshi Yamauchi <yamauchi@google.com> Implement rosalloc page trimming without suspending threads.

Also, making it more efficient by not going through the chunks smaller
than the page size by not using InspectAll().

Change-Id: I79ceb0374cb8aba5f6b8dde1afbace9af98b6cff
4cd662e54440f76fc920cb2c67acab3bba8b33dd 04-Apr-2014 Hiroshi Yamauchi <yamauchi@google.com> Fix Object::Clone()'s pre-fence barrier.

Pass in a pre-fence barrier object that sets in the array length
instead of setting it after returning from AllocObject().

Fix another potential bug due to the wrong default pre-fence barrier
parameter value. Since this appears error-prone, removed the default
parameter value and make it an explicit parameter.

Fix another potential moving GC bug due to a lack of a SirtRef.

Bug: 13097759
Change-Id: I466aa0e50f9e1a5dbf20be5a195edee619c7514e
dd7624d2b9e599d57762d12031b10b89defc9807 15-Mar-2014 Ian Rogers <irogers@google.com> Allow mixing of thread offsets between 32 and 64bit architectures.

Begin a more full implementation x86-64 REX prefixes.
Doesn't implement 64bit thread offset support for the JNI compiler.

Change-Id: If9af2f08a1833c21ddb4b4077f9b03add1a05147
3c79a30cc6eeed11b80e61119595b7a586b36435 03-Feb-2014 Hiroshi Yamauchi <yamauchi@google.com> Fix a race condition in RosAlloc.

Fix a race condition in RosAlloc between RosAlloc::BulkFree() and
RosAlloc::RevokeThreadLocalRuns() with regard to bulk_free_bit_map.

(Cherry-pick commit 70f60042558e0a766e98f2aaefbf80596ace4d53)

Bug: 13192845
Bug: 12592026
Change-Id: I21afe8f9f3ad4166762d2bbfa0a6ae9484d8b12b
c93c530efc175954160c3834c93961a1a946a35a 21-Mar-2014 Hiroshi Yamauchi <yamauchi@google.com> Revoke rosalloc thread-local buffers at the checkpoint.

In the mark sweep collector, rosalloc thread-local buffers were
revoked during the pause. Now, they are revoked at the thread
checkpoint, as opposed to during the pause, which appears to help
reduce the pause time.

In Ritz MemAllocTest, the average sticky pause time went down ~20%
(925 us -> 724 us).

Bug: 13394464
Bug: 9986565
Change-Id: I104992a11b46d59264c0b9aa2db82b1ccf2826bc
5d057056db1923947ba846b391d981759b15714a 12-Mar-2014 Ian Rogers <irogers@google.com> Improve use of CHECK macros.

Motivated by a bogus compiler warning for debug with O2, switch from
CHECK(x < y) to the recommended CHECK_LT(x, y). Fix bug in
RosAlloc::Initialize where an assignment was being performed within
a DCHECK.

Change-Id: Iaf466849ae79ae1497162e81a3e092bf13109aa9
661974a5561e5ccdfbac8cb5d8df8b7e6f3483b8 09-Jan-2014 Mathieu Chartier <mathieuc@google.com> Fix valgrind gtests and memory leaks.

All tests pass other than image_test which passes if some bad reads
are disabled (buzbee working on this).

Change-Id: Ifd6b6e3aed0bc867703b6e818353a9f296609422
26d69ffc0ebc98fbc5f316d8cd3ee6ba5b2001ac 27-Feb-2014 Hiroshi Yamauchi <yamauchi@google.com> Decrease lock uses in RosAlloc::BulkFree().

Read rosalloc page map entries without a lock.

Disabled for now.

This change speeds up Ritz MemAllocTest by ~25% on host and reduces
the GC sweep time, but somehow slows it down by ~5% on N4, which is
why it's disabled for now. TODO: look into the slowdown on N4 more.

Bug: 8262791
Bug: 11790317
Change-Id: I936bbee9cfbd389e70d6343503bf0923865d2a2c
f5b0e20b5b31f5f5465784adcf2a204dcd69c7fd 12-Feb-2014 Hiroshi Yamauchi <yamauchi@google.com> Thread-local allocation stack.

With this change, Ritz MemAllocTest gets ~14% faster on N4.

Bug: 9986565
Change-Id: I2fb7d6f7c5daa63dd4fc73ba739e6ae4ed820617
a4adbfd44032d70e166e6f18096bbbed05a990ba 05-Feb-2014 Hiroshi Yamauchi <yamauchi@google.com> RosAlloc verification.

If enabled, RosAlloc verification checks the allocator internal
metadata and invariants to detect bugs, heap corruptions, and race
conditions. Added runtime options for enabling and disabling
it. Enable it for the debug build.

Bug: 9986565
Bug: 12592026
Change-Id: I923742b87805ae839f1549d78d0d492733da6a58
70f60042558e0a766e98f2aaefbf80596ace4d53 03-Feb-2014 Hiroshi Yamauchi <yamauchi@google.com> Fix a race condition in RosAlloc.

Fix a race condition in RosAlloc between RosAlloc::BulkFree() and
RosAlloc::RevokeThreadLocalRuns() with regard to bulk_free_bit_map.

Change-Id: I128917d5bdfe2dab604174ca4cbe228282578b8a
Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
Bug: 12592026
573f7d2d68e1838a0485e6b40d90c967526e00c2 17-Dec-2013 Hiroshi Yamauchi <yamauchi@google.com> Fix an array copy benchmark regression.

Add different page release modes to rosalloc.

Bug: 12064551
Change-Id: Ib837bbd1a2757741a4e2743e0a1272bf46a30252
218daa2d876c5989f956e8e54b8f28f33d11b31f 25-Nov-2013 Brian Carlstrom <bdc@google.com> Change thread.h to thread-inl.h for missing Thread::Current for rosalloc.cc

Change-Id: Ieded9c5c93839c8f3eb5c5427229743a4c45c5ee
3c2856e939f3daa7a95a1f8cb70f47e7a621db3c 22-Nov-2013 Hiroshi Yamauchi <yamauchi@google.com> Inline RosAlloc::Alloc().

Bug: 9986565
Change-Id: I9bc411b8ae39379f9d730f40974857a585405fde
cf58d4adf461eb9b8e84baa8019054c88cd8acc6 26-Sep-2013 Hiroshi Yamauchi <yamauchi@google.com> A custom 'runs-of-slots' memory allocator.

Bug: 9986565
Change-Id: I0eb73b9458752113f519483616536d219d5f798b