History log of /external/jemalloc/include/jemalloc/internal/tcache.h
Revision Date Author Comments
dff860754ebc4a698c8ea257de383ffc398a5a48 20-May-2015 Jason Evans <jasone@canonware.com> Impose a minimum tcache count for small size classes.

Now that small allocation runs have fewer regions due to run metadata
residing in chunk headers, an explicit minimum tcache count is needed to
make sure that tcache adequately amortizes synchronization overhead.

Bug: 21326736
(cherry picked from commit 83d543f8689bc7c6142179a5491bdf2a31b5cfc7)

Change-Id: I4178902b63ed310100019fee0805a11839de740f
/external/jemalloc/include/jemalloc/internal/tcache.h
83e5767ee9a8c68150cca06ae0d27a13ba4fcaf8 22-Apr-2015 Christopher Ferris <cferris@google.com> Revert "Revert "Merge remote-tracking branch 'aosp/upstream-dev' into merge""

This reverts commit 75929a97332565c3b987986f35652b6d5d275d3c.

The original failure this was reverted for seems to have been a bug somewhere else.

Change-Id: Ib29ba03b1b967f940dc19eceac2aa1d2923be1eb
/external/jemalloc/include/jemalloc/internal/tcache.h
75929a97332565c3b987986f35652b6d5d275d3c 16-Apr-2015 Nicolas Geoffray <ngeoffray@google.com> Revert "Merge remote-tracking branch 'aosp/upstream-dev' into merge"

Revert due to random ART crashes seen.

This reverts commit 5b5d97b42e84c2ac417271c3fab6fc282496a335.

Change-Id: I62a784301fded7ee853b182d172be46bb32bded7
/external/jemalloc/include/jemalloc/internal/tcache.h
5b5d97b42e84c2ac417271c3fab6fc282496a335 16-Apr-2015 Christopher Ferris <cferris@google.com> Merge remote-tracking branch 'aosp/upstream-dev' into merge

Change-Id: If743a1d002b1793c08a66c0bbd5c2c3eedcebe64
41cfe03f39740fe61cf46d86982f66c24168de32 14-Feb-2015 Jason Evans <je@fb.com> If MALLOCX_ARENA(a) is specified, use it during tcache fill.
/external/jemalloc/include/jemalloc/internal/tcache.h
1cb181ed632e7573fb4eab194e4d216867222d27 30-Jan-2015 Jason Evans <je@fb.com> Implement explicit tcache support.

Add the MALLOCX_TCACHE() and MALLOCX_TCACHE_NONE macros, which can be
used in conjunction with the *allocx() API.

Add the tcache.create, tcache.flush, and tcache.destroy mallctls.

This resolves #145.
/external/jemalloc/include/jemalloc/internal/tcache.h
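A minimal usage sketch of the macros and mallctls this commit adds, assuming the standard <jemalloc/jemalloc.h> public header and a jemalloc build that includes the *allocx() API:

    #include <jemalloc/jemalloc.h>

    int main(void) {
        unsigned tc;
        size_t sz = sizeof(tc);
        /* Create an explicit tcache and obtain its index. */
        if (mallctl("tcache.create", &tc, &sz, NULL, 0) != 0)
            return 1;
        /* Route an allocation/deallocation through that tcache. */
        void *p = mallocx(64, MALLOCX_TCACHE(tc));
        if (p != NULL)
            dallocx(p, MALLOCX_TCACHE(tc));
        /* MALLOCX_TCACHE_NONE bypasses tcaches entirely for a single call. */
        void *q = mallocx(64, MALLOCX_TCACHE_NONE);
        if (q != NULL)
            dallocx(q, MALLOCX_TCACHE_NONE);
        /* Flush, then destroy the explicit tcache. */
        mallctl("tcache.flush", NULL, NULL, &tc, sizeof(tc));
        mallctl("tcache.destroy", NULL, NULL, &tc, sizeof(tc));
        return 0;
    }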
2c5cb613dfbdf58f88152321b63e60c58cd23972 08-Dec-2014 Guilherme Goncalves <guilherme.p.gonc@gmail.com> Introduce two new modes of junk filling: "alloc" and "free".

In addition to true/false, opt.junk can now be either "alloc" or "free",
giving applications the possibility of junking memory only on allocation
or deallocation.

This resolves #172.
/external/jemalloc/include/jemalloc/internal/tcache.h
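For illustration, either junk mode can be requested through the malloc_conf configuration string that jemalloc reads at startup (option name and values as described above):

    /* Junk-fill only on deallocation; "alloc", "true", and "false" work the same way. */
    const char *malloc_conf = "junk:free";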
e12eaf93dca308a426c182956197b0eeb5f2cff3 08-Dec-2014 Jason Evans <je@fb.com> Style and spelling fixes.
/external/jemalloc/include/jemalloc/internal/tcache.h
fb795867f0b3aa28bbdf177e1026f3e3408e0338 14-Nov-2014 Christopher Ferris <cferris@google.com> Tune jemalloc to rein in PSS.

The tcache in jemalloc can take up quite a bit of extra PSS. Disabling
the tcache can save a lot of PSS, but it radically reduces performance.

Tune the number of small and large values to store in the tcache.
Immediately force any dirty pages to be purged, rather than keep some
number of dirty pages around.

Restore the chunk size to 4MB. Using this chunk size together with forced
dirty-page purging results in a higher cf-bench native mallocs score but
about the same amount of PSS use.

Limit the number of arenas to 2. The default is 2 * number of cpus, but
that increases the amount of PSS used. My benchmarking indicates that
more than 2 really doesn't help too much even on a device with 4 cpus.
Nearly all speed-ups come from the tcache.

Bug: 17498287

Change-Id: I23b23dd88288c90e002a0a04684fb06dbf4ee742
/external/jemalloc/include/jemalloc/internal/tcache.h
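The tuned values can be inspected at run time through jemalloc's stats/ctl interface; a small sketch, assuming the public <jemalloc/jemalloc.h> header and that opt.lg_chunk is a size_t option in this jemalloc line:

    #include <stdio.h>
    #include <jemalloc/jemalloc.h>

    int main(void) {
        size_t lg_chunk, sz = sizeof(lg_chunk);
        /* Chunk size: this change restores it to 4 MiB (lg_chunk == 22). */
        if (mallctl("opt.lg_chunk", &lg_chunk, &sz, NULL, 0) == 0)
            printf("chunk size = %zu bytes\n", (size_t)1 << lg_chunk);
        /* Full configuration and per-arena statistics, including tcache settings. */
        malloc_stats_print(NULL, NULL, NULL);
        return 0;
    }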
fc0b3b7383373d66cfed2cd4e2faa272a6868d32 10-Oct-2014 Jason Evans <jasone@canonware.com> Add configure options.

Add:
--with-lg-page
--with-lg-page-sizes
--with-lg-size-class-group
--with-lg-quantum

Get rid of STATIC_PAGE_SHIFT, in favor of directly setting LG_PAGE.

Fix various edge conditions exposed by the configure options.
/external/jemalloc/include/jemalloc/internal/tcache.h
8bb3198f72fc7587dc93527f9f19fb5be52fa553 08-Oct-2014 Jason Evans <jasone@canonware.com> Refactor/fix arenas manipulation.

Abstract arenas access to use arena_get() (or a0get() where appropriate)
rather than directly reading e.g. arenas[ind]. Prior to the addition of
the arenas.extend mallctl, the worst possible outcome of directly
accessing arenas was a stale read, but arenas.extend may allocate and
assign a new array to arenas.

Add a tsd-based arenas_cache, which amortizes arenas reads. This
introduces some subtle bootstrapping issues, with tsd_boot() now being
split into tsd_boot[01]() to support tsd wrapper allocation
bootstrapping, as well as an arenas_cache_bypass tsd variable which
dynamically terminates allocation of arenas_cache itself.

Promote a0malloc(), a0calloc(), and a0free() to be generally useful for
internal allocation, and use them in several places (more may be
appropriate).

Abstract arena->nthreads management and fix a missing decrement during
thread destruction (recent tsd refactoring left arenas_cleanup()
unused).

Change arena_choose() to propagate OOM, and handle OOM in all callers.
This is important for providing consistent allocation behavior when the
MALLOCX_ARENA() flag is being used. Prior to this fix, it was possible
for an OOM to result in allocation silently allocating from a different
arena than the one specified.
/external/jemalloc/include/jemalloc/internal/tcache.h
155bfa7da18cab0d21d87aa2dce4554166836f5d 06-Oct-2014 Jason Evans <jasone@canonware.com> Normalize size classes.

Normalize size classes to use the same number of size classes per size
doubling (currently hard coded to 4), across the entire range of size
classes. Small size classes already used this spacing, but in order to
support this change, additional small size classes now fill [4 KiB .. 16
KiB). Large size classes range from [16 KiB .. 4 MiB). Huge size
classes now support non-multiples of the chunk size in order to fill (4
MiB .. 16 MiB).
/external/jemalloc/include/jemalloc/internal/tcache.h
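A short sketch of what the normalized spacing means for callers, using nallocx() to show how requests round up within the [4 KiB .. 8 KiB) group (illustrative request sizes; the exact classes depend on LG_QUANTUM and LG_PAGE):

    #include <stdio.h>
    #include <jemalloc/jemalloc.h>

    int main(void) {
        /* With 4 classes per doubling, this group is spaced 1 KiB apart:
           5 KiB, 6 KiB, 7 KiB, 8 KiB. */
        size_t reqs[] = {4097, 5121, 6145, 7169};
        for (size_t i = 0; i < sizeof(reqs) / sizeof(reqs[0]); i++)
            printf("request %zu -> usable %zu\n", reqs[i], nallocx(reqs[i], 0));
        return 0;
    }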
16854ebeb77c9403ebd1b85fdd46ee80bb3f3e9d 05-Oct-2014 Jason Evans <jasone@canonware.com> Don't disable tcache for lazy-lock.

Don't disable tcache when lazy-lock is configured. There already exists
a mechanism to disable tcache, but doing so automatically due to
lazy-lock causes surprising performance behavior.
/external/jemalloc/include/jemalloc/internal/tcache.h
029d44cf8b22aa7b749747bfd585887fb59e0030 04-Oct-2014 Jason Evans <jasone@canonware.com> Fix tsd cleanup regressions.

Fix tsd cleanup regressions that were introduced in
5460aa6f6676c7f253bfcb75c028dfd38cae8aaf (Convert all tsd variables to
reside in a single tsd structure.). These regressions were twofold:

1) tsd_tryget() should never (and need never) return NULL. Rename it to
tsd_fetch() and simplify all callers.
2) tsd_*_set() must only be called when tsd is in the nominal state,
because cleanup happens during the nominal-->purgatory transition,
and re-initialization must not happen while in the purgatory state.
Add tsd_nominal() and use it as needed. Note that tsd_*{p,}_get()
can still be used as long as no re-initialization that would require
cleanup occurs. This means that e.g. the thread_allocated counter
can be updated unconditionally.
/external/jemalloc/include/jemalloc/internal/tcache.h
551ebc43647521bdd0bc78558b106762b3388928 03-Oct-2014 Jason Evans <jasone@canonware.com> Convert to uniform style: cond == false --> !cond
/external/jemalloc/include/jemalloc/internal/tcache.h
5460aa6f6676c7f253bfcb75c028dfd38cae8aaf 23-Sep-2014 Jason Evans <jasone@canonware.com> Convert all tsd variables to reside in a single tsd structure.
/external/jemalloc/include/jemalloc/internal/tcache.h
9c640bfdd4e2f25180a32ed3704ce8e4c4cc21f1 12-Sep-2014 Jason Evans <jasone@canonware.com> Apply likely()/unlikely() to allocation/deallocation fast paths.
/external/jemalloc/include/jemalloc/internal/tcache.h
23fdf8b359a690f457c5300338f4994d06402b95 09-Sep-2014 Daniel Micay <danielmicay@gmail.com> mark some conditions as unlikely

* assertion failure
* malloc_init failure
* malloc not already initialized (in malloc_init)
* running in valgrind
* thread cache disabled at runtime

Clang and GCC already consider a comparison with NULL or -1 to be cold,
so many branches (e.g. out-of-memory) are already correctly treated as
cold, and marking them is not important.
/external/jemalloc/include/jemalloc/internal/tcache.h
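A sketch of the annotation style these two commits apply; jemalloc defines its own likely()/unlikely() wrappers, shown here in terms of the GCC/Clang builtin they rely on:

    #include <stdlib.h>

    #define likely(x)   __builtin_expect(!!(x), 1)
    #define unlikely(x) __builtin_expect(!!(x), 0)

    void *checked_alloc(size_t n) {
        void *p = malloc(n);
        if (unlikely(p == NULL))    /* out-of-memory: cold path */
            return NULL;
        return p;
    }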
3541a904d6fb949f3f0aea05418ccce7cbd4b705 17-Apr-2014 Jason Evans <je@fb.com> Refactor small_size2bin and small_bin2size.

Refactor small_size2bin and small_bin2size to be inline functions rather
than directly accessed arrays.
/external/jemalloc/include/jemalloc/internal/tcache.h
3e3caf03af6ca579e473ace4daf25f63102aca4f 17-Apr-2014 Jason Evans <jasone@canonware.com> Merge pull request #73 from bmaurer/smallmalloc

Smaller malloc hot path
021136ce4db79f50031a1fd5dd751891888fbc7b 16-Apr-2014 Ben Maurer <bmaurer@fb.com> Create a const array with only a small bin to size map
/external/jemalloc/include/jemalloc/internal/tcache.h
a7619b7fa56f98d1ca99a23b458696dd37c12b77 15-Apr-2014 Ben Maurer <bmaurer@fb.com> outline rare tcache_get codepaths
/external/jemalloc/include/jemalloc/internal/tcache.h
bd87b01999416ec7418ff8bdb504d9b6c009ff68 16-Apr-2014 Jason Evans <je@fb.com> Optimize Valgrind integration.

Forcefully disable tcache if running inside Valgrind, and remove
Valgrind calls in tcache-specific code.

Restructure Valgrind-related code to move most Valgrind calls out of the
fast path functions.

Take advantage of static knowledge to elide some branches in
JEMALLOC_VALGRIND_REALLOC().
/external/jemalloc/include/jemalloc/internal/tcache.h
9b0cbf0850b130a9b0a8c58bd10b2926b2083510 11-Apr-2014 Jason Evans <je@fb.com> Remove support for non-prof-promote heap profiling metadata.

Make promotion of sampled small objects to large objects mandatory, so
that profiling metadata can always be stored in the chunk map, rather
than requiring one pointer per small region in each small-region page
run. In practice the non-prof-promote code was only useful when using
jemalloc to track all objects and report them as leaks at program exit.
However, Valgrind is at least as good a tool for this particular use
case.

Furthermore, the non-prof-promote code is getting in the way of
some optimizations that will make heap profiling much cheaper for the
predominant use case (sampling a small representative proportion of all
allocations).
/external/jemalloc/include/jemalloc/internal/tcache.h
6e62984ef6ca4312cf0a2e49ea2cc38feb94175b 16-Dec-2013 Jason Evans <jasone@canonware.com> Don't junk-fill reallocations unless usize changes.

Don't junk fill reallocations for which the request size is less than
the current usable size, but not enough smaller to cause a size class
change. Unlike malloc()/calloc()/realloc(), *allocx() contractually
treats the full usize as the allocation, so a caller can ask for zeroed
memory via mallocx() and a series of rallocx() calls that all specify
MALLOCX_ZERO, and be assured that all newly allocated bytes will be
zeroed and made available to the application without danger of allocator
mutation until the size class decreases enough to cause usize reduction.
/external/jemalloc/include/jemalloc/internal/tcache.h
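A sketch of the contract described above, assuming the standard public header: every byte made usable by these calls arrives zeroed, including the bytes exposed when the allocation grows.

    #include <jemalloc/jemalloc.h>

    int main(void) {
        char *p = mallocx(100, MALLOCX_ZERO);
        if (p != NULL) {
            char *q = rallocx(p, 200, MALLOCX_ZERO); /* growth keeps the zero guarantee */
            if (q != NULL)
                p = q;
            dallocx(p, 0);
        }
        return 0;
    }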
dda90f59e2b67903668a2799970f64df163e9ccf 20-Oct-2013 Jason Evans <jasone@canonware.com> Fix a Valgrind integration flaw.

Fix a Valgrind integration flaw that caused Valgrind warnings about
reads of uninitialized memory in internal zero-initialized data
structures (relevant to tcache and prof code).
/external/jemalloc/include/jemalloc/internal/tcache.h
06912756cccd0064a9c5c59992dbac1cec68ba3f 01-Feb-2013 Jason Evans <je@fb.com> Fix Valgrind integration.

Fix Valgrind integration to annotate all internally allocated memory in
a way that keeps Valgrind happy about internal data structure access.
/external/jemalloc/include/jemalloc/internal/tcache.h
88393cb0eb9a046000d20809809d4adac11957ab 22-Jan-2013 Jason Evans <jasone@canonware.com> Add and use JEMALLOC_ALWAYS_INLINE.

Add JEMALLOC_ALWAYS_INLINE and use it to guarantee that the entire fast
paths of the primary allocation/deallocation functions are inlined.
/external/jemalloc/include/jemalloc/internal/tcache.h
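An approximate shape of the macro this commit adds, written as a sketch for GCC/Clang (the real definition lives in jemalloc's internal headers and also covers other compilers):

    #if defined(__GNUC__) || defined(__clang__)
    #  define JEMALLOC_ALWAYS_INLINE static inline __attribute__((always_inline))
    #else
    #  define JEMALLOC_ALWAYS_INLINE static inline
    #endif

    JEMALLOC_ALWAYS_INLINE int
    fast_path_helper(int x) {
        return x + 1;
    }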
38067483c542adfe092644d1ecc103c6bc74add0 22-Jan-2013 Jason Evans <jasone@canonware.com> Tighten valgrind integration.

Tighten valgrind integration such that immediately after memory is
validated or zeroed, valgrind is told to forget the memory's 'defined'
state. The only place newly allocated memory should be left marked as
'defined' is in the public functions (e.g. calloc() and realloc()).
/external/jemalloc/include/jemalloc/internal/tcache.h
203484e2ea267e068a68fd2922263f0ff1d5ac6f 02-May-2012 Jason Evans <je@fb.com> Optimize malloc() and free() fast paths.

Embed the bin index for small page runs into the chunk page map, in
order to omit the bracketed loads in the following dependent load sequence:
ptr-->mapelm-->[run-->bin-->]bin_info

Move various non-critical code out of the inlined function chain into
helper functions (tcache_event_hard(), arena_dalloc_small(), and
locking).
/external/jemalloc/include/jemalloc/internal/tcache.h
f7088e6c992d079bc3162e0c48ed4dc5def6d263 20-Apr-2012 Jason Evans <je@fb.com> Make arena_salloc() an inline function.
/external/jemalloc/include/jemalloc/internal/tcache.h
122449b073bcbaa504c4f592ea2d733503c272d2 06-Apr-2012 Jason Evans <je@fb.com> Implement Valgrind support, redzones, and quarantine.

Implement Valgrind support, as well as the redzone and quarantine
features, which help Valgrind detect memory errors. Redzones are only
implemented for small objects because the changes necessary to support
redzones around large and huge objects are complicated by in-place
reallocation, to the point that it isn't clear that the maintenance
burden is worth the incremental improvement to Valgrind support.

Merge arena_salloc() and arena_salloc_demote().

Refactor i[v]salloc() to expose the 'demote' option.
/external/jemalloc/include/jemalloc/internal/tcache.h
01b3fe55ff3ac8e4aa689f09fcb0729da8037638 03-Apr-2012 Jason Evans <jasone@canonware.com> Add a0malloc(), a0calloc(), and a0free().

Add a0malloc(), a0calloc(), and a0free(), which are used by FreeBSD's
libc to allocate/deallocate TLS in static binaries.
/external/jemalloc/include/jemalloc/internal/tcache.h
ae4c7b4b4092906c641d69b4bf9fcb4a7d50790d 02-Apr-2012 Jason Evans <jasone@canonware.com> Clean up *PAGE* macros.

s/PAGE_SHIFT/LG_PAGE/g and s/PAGE_SIZE/PAGE/g.

Remove remnants of the dynamic-page-shift code.

Rename the "arenas.pagesize" mallctl to "arenas.page".

Remove the "arenas.chunksize" mallctl, which is redundant with
"opt.lg_chunk".
/external/jemalloc/include/jemalloc/internal/tcache.h
f2296deb57cdda01685f0d0ccf3c6e200378c673 30-Mar-2012 Jason Evans <jasone@canonware.com> Clean up tsd (no functional changes).
/external/jemalloc/include/jemalloc/internal/tcache.h
09a0769ba7a3d139168e606e4295f8002861355f 30-Mar-2012 Jason Evans <jasone@canonware.com> Work around TLS deallocation via free().

glibc uses memalign()/free() to allocate/deallocate TLS, which means
that it is unsafe to set TLS variables as a side effect of free() --
they may already be deallocated. Work around this by avoiding
tcache_create() within free().

Reported by Mike Hommey.
/external/jemalloc/include/jemalloc/internal/tcache.h
d4be8b7b6ee2e21d079180455d4ccbf45cc1cee7 27-Mar-2012 Jason Evans <jasone@canonware.com> Add the "thread.tcache.enabled" mallctl.
/external/jemalloc/include/jemalloc/internal/tcache.h
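A minimal sketch of toggling the calling thread's cache with the new mallctl (standard public header assumed; set_thread_tcache() is an illustrative helper name):

    #include <stdbool.h>
    #include <jemalloc/jemalloc.h>

    static void set_thread_tcache(bool enable) {
        bool old;
        size_t sz = sizeof(old);
        /* Read the previous setting and install the new one for this thread. */
        mallctl("thread.tcache.enabled", &old, &sz, &enable, sizeof(enable));
    }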
cd9a1346e96f71bdecdc654ea50fc62d76371e74 22-Mar-2012 Jason Evans <je@fb.com> Implement tsd.

Implement tsd, which is a TLS/TSD abstraction that uses one or both
internally. Modify bootstrapping such that no tsd's are utilized until
allocation is safe.

Remove malloc_[v]tprintf(), and use malloc_snprintf() instead.

Fix %p argument size handling in malloc_vsnprintf().

Fix a long-standing statistics-related bug in the "thread.arena"
mallctl that could cause crashes due to linked list corruption.
/external/jemalloc/include/jemalloc/internal/tcache.h
e24c7af35d1e9d24d02166ac98cfca7cf807ff13 19-Mar-2012 Jason Evans <je@fb.com> Invert NO_TLS to JEMALLOC_TLS.
/external/jemalloc/include/jemalloc/internal/tcache.h
4507f34628dfae26e6b0a6faa13e5f9a49600616 05-Mar-2012 Jason Evans <je@fb.com> Remove the lg_tcache_gc_sweep option.

Remove the lg_tcache_gc_sweep option, because it is no longer
very useful. Prior to the addition of dynamic adjustment of tcache fill
count, it was possible for fill/flush overhead to be a problem, but this
problem no longer occurs.
/external/jemalloc/include/jemalloc/internal/tcache.h
3add8d8cda2993f58fd2eba6efbf4fa12d5c72f3 29-Feb-2012 Jason Evans <je@fb.com> Remove unused variables in tcache_dalloc_large().

Submitted by Mike Hommey.
/external/jemalloc/include/jemalloc/internal/tcache.h
b172610317babc7f365584ddd7fdaf4eb8d9d04c 29-Feb-2012 Jason Evans <je@fb.com> Simplify small size class infrastructure.

Program-generate small size class tables for all valid combinations of
LG_TINY_MIN, LG_QUANTUM, and PAGE_SHIFT. Use the appropriate table to generate
all relevant data structures, and remove the distinction between
tiny/quantum/cacheline/subpage bins.

Remove --enable-dynamic-page-shift. This option didn't prove useful in
practice, and it prevented optimizations.

Add Tilera architecture support.
/external/jemalloc/include/jemalloc/internal/tcache.h
962463d9b57bcc65de2fa108a691b4183b9b2faf 13-Feb-2012 Jason Evans <je@fb.com> Streamline tcache-related malloc/free fast paths.

tcache_get() is inlined, so do the config_tcache check inside
tcache_get() and simplify its callers.

Make arena_malloc() an inline function, since it is part of the malloc()
fast path.

Remove conditional logic that caused build issues if --disable-tcache was
specified.
/external/jemalloc/include/jemalloc/internal/tcache.h
fd56043c53f1cd1335ae6d1c0ee86cc0fbb9f12e 13-Feb-2012 Jason Evans <je@fb.com> Remove magic.

Remove structure magic, because 1) it is no longer conditional, and 2)
it stopped being very effective at detecting memory corruption several
years ago.
/external/jemalloc/include/jemalloc/internal/tcache.h
7372b15a31c63ac5cb9ed8aeabc2a0a3c005e8bf 11-Feb-2012 Jason Evans <je@fb.com> Reduce cpp conditional logic complexity.

Convert configuration-related cpp conditional logic to use static
constant variables, e.g.:

#ifdef JEMALLOC_DEBUG
	[...]
#endif

becomes:

if (config_debug) {
	[...]
}

The advantage is clearer, more concise code. The main disadvantage is
that data structures no longer have conditionally defined fields, so
they pay the cost of all fields regardless of whether they are used. In
practice, this is only a minor concern; config_stats will go away in an
upcoming change, and config_prof is the only other major feature that
depends on more than a few special-purpose fields.
/external/jemalloc/include/jemalloc/internal/tcache.h
7427525c28d58c423a68930160e3b0fe577fe953 01-Apr-2011 Jason Evans <jasone@canonware.com> Move repo contents in jemalloc/ to top level.
/external/jemalloc/include/jemalloc/internal/tcache.h