dff860754ebc4a698c8ea257de383ffc398a5a48 |
|
20-May-2015 |
Jason Evans <jasone@canonware.com> |
Impose a minimum tcache count for small size classes. Now that small allocation runs have fewer regions due to run metadata residing in chunk headers, an explicit minimum tcache count is needed to make sure that tcache adequately amortizes synchronization overhead. Bug: 21326736 (cherry picked from commit 83d543f8689bc7c6142179a5491bdf2a31b5cfc7) Change-Id: I4178902b63ed310100019fee0805a11839de740f
/external/jemalloc/include/jemalloc/internal/tcache.h
|
83e5767ee9a8c68150cca06ae0d27a13ba4fcaf8 |
|
22-Apr-2015 |
Christopher Ferris <cferris@google.com> |
Revert "Revert "Merge remote-tracking branch 'aosp/upstream-dev' into merge"" This reverts commit 75929a97332565c3b987986f35652b6d5d275d3c. The original failure this was reverted for seems to have been a bug somewhere else. Change-Id: Ib29ba03b1b967f940dc19eceac2aa1d2923be1eb
/external/jemalloc/include/jemalloc/internal/tcache.h
|
75929a97332565c3b987986f35652b6d5d275d3c |
|
16-Apr-2015 |
Nicolas Geoffray <ngeoffray@google.com> |
Revert "Merge remote-tracking branch 'aosp/upstream-dev' into merge" Revert due to random ART crashes seen. This reverts commit 5b5d97b42e84c2ac417271c3fab6fc282496a335. Change-Id: I62a784301fded7ee853b182d172be46bb32bded7
/external/jemalloc/include/jemalloc/internal/tcache.h
|
5b5d97b42e84c2ac417271c3fab6fc282496a335 |
|
16-Apr-2015 |
Christopher Ferris <cferris@google.com> |
Merge remote-tracking branch 'aosp/upstream-dev' into merge Change-Id: If743a1d002b1793c08a66c0bbd5c2c3eedcebe64
|
41cfe03f39740fe61cf46d86982f66c24168de32 |
|
14-Feb-2015 |
Jason Evans <je@fb.com> |
If MALLOCX_ARENA(a) is specified, use it during tcache fill.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
1cb181ed632e7573fb4eab194e4d216867222d27 |
|
30-Jan-2015 |
Jason Evans <je@fb.com> |
Implement explicit tcache support. Add the MALLOCX_TCACHE() and MALLOCX_TCACHE_NONE macros, which can be used in conjunction with the *allocx() API. Add the tcache.create, tcache.flush, and tcache.destroy mallctls. This resolves #145.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
2c5cb613dfbdf58f88152321b63e60c58cd23972 |
|
08-Dec-2014 |
Guilherme Goncalves <guilherme.p.gonc@gmail.com> |
Introduce two new modes of junk filling: "alloc" and "free". In addition to true/false, opt.junk can now be either "alloc" or "free", giving applications the possibility of junking memory only on allocation or deallocation. This resolves #172.
/external/jemalloc/include/jemalloc/internal/tcache.h
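The junk modes from the commit above are selected through jemalloc's usual configuration channel. A minimal config sketch (`./app` is a hypothetical jemalloc-linked binary; assumes a build with fill support enabled):

```shell
# Junk-fill only on allocation (catch reads of uninitialized memory):
MALLOC_CONF=junk:alloc ./app
# Junk-fill only on deallocation (catch use-after-free):
MALLOC_CONF=junk:free ./app
# Both, matching the old opt.junk=true behavior:
MALLOC_CONF=junk:true ./app
```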
|
e12eaf93dca308a426c182956197b0eeb5f2cff3 |
|
08-Dec-2014 |
Jason Evans <je@fb.com> |
Style and spelling fixes.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
fb795867f0b3aa28bbdf177e1026f3e3408e0338 |
|
14-Nov-2014 |
Christopher Ferris <cferris@google.com> |
Tune jemalloc to rein in PSS. The tcache in jemalloc can take up quite a bit of extra PSS. Disabling the tcache can save a lot of PSS, but it radically reduces performance. Tune the number of small and large values to store in the tcache. Immediately force any dirty pages to be purged, rather than keeping some number of dirty pages around. Restore the chunk size back to 4MB. Using this chunk size and forcing dirty-page purging results in a higher cf-bench native mallocs score but about the same amount of PSS use. Limit the number of arenas to 2. The default is 2 * number of cpus, but that increases the amount of PSS used. My benchmarking indicates that more than 2 really doesn't help much even on a device with 4 cpus. Nearly all speed-ups come from the tcache. Bug: 17498287 Change-Id: I23b23dd88288c90e002a0a04684fb06dbf4ee742
/external/jemalloc/include/jemalloc/internal/tcache.h
|
fc0b3b7383373d66cfed2cd4e2faa272a6868d32 |
|
10-Oct-2014 |
Jason Evans <jasone@canonware.com> |
Add configure options. Add: --with-lg-page --with-lg-page-sizes --with-lg-size-class-group --with-lg-quantum Get rid of STATIC_PAGE_SHIFT, in favor of directly setting LG_PAGE. Fix various edge conditions exposed by the configure options.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
8bb3198f72fc7587dc93527f9f19fb5be52fa553 |
|
08-Oct-2014 |
Jason Evans <jasone@canonware.com> |
Refactor/fix arenas manipulation. Abstract arenas access to use arena_get() (or a0get() where appropriate) rather than directly reading e.g. arenas[ind]. Prior to the addition of the arenas.extend mallctl, the worst possible outcome of directly accessing arenas was a stale read, but arenas.extend may allocate and assign a new array to arenas. Add a tsd-based arenas_cache, which amortizes arenas reads. This introduces some subtle bootstrapping issues, with tsd_boot() now being split into tsd_boot[01]() to support tsd wrapper allocation bootstrapping, as well as an arenas_cache_bypass tsd variable which dynamically terminates allocation of arenas_cache itself. Promote a0malloc(), a0calloc(), and a0free() to be generally useful for internal allocation, and use them in several places (more may be appropriate). Abstract arena->nthreads management and fix a missing decrement during thread destruction (recent tsd refactoring left arenas_cleanup() unused). Change arena_choose() to propagate OOM, and handle OOM in all callers. This is important for providing consistent allocation behavior when the MALLOCX_ARENA() flag is being used. Prior to this fix, it was possible for an OOM to result in an allocation silently being served from a different arena than the one specified.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
155bfa7da18cab0d21d87aa2dce4554166836f5d |
|
06-Oct-2014 |
Jason Evans <jasone@canonware.com> |
Normalize size classes. Normalize size classes to use the same number of size classes per size doubling (currently hard coded to 4), across the entire range of size classes. Small size classes already used this spacing, but in order to support this change, additional small size classes now fill [4 KiB .. 16 KiB). Large size classes range from [16 KiB .. 4 MiB). Huge size classes now support non-multiples of the chunk size in order to fill (4 MiB .. 16 MiB).
/external/jemalloc/include/jemalloc/internal/tcache.h
|
16854ebeb77c9403ebd1b85fdd46ee80bb3f3e9d |
|
05-Oct-2014 |
Jason Evans <jasone@canonware.com> |
Don't disable tcache for lazy-lock. Don't disable tcache when lazy-lock is configured. There already exists a mechanism to disable tcache, but doing so automatically due to lazy-lock causes surprising performance behavior.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
029d44cf8b22aa7b749747bfd585887fb59e0030 |
|
04-Oct-2014 |
Jason Evans <jasone@canonware.com> |
Fix tsd cleanup regressions. Fix tsd cleanup regressions that were introduced in 5460aa6f6676c7f253bfcb75c028dfd38cae8aaf (Convert all tsd variables to reside in a single tsd structure.). These regressions were twofold: 1) tsd_tryget() should never (and need never) return NULL. Rename it to tsd_fetch() and simplify all callers. 2) tsd_*_set() must only be called when tsd is in the nominal state, because cleanup happens during the nominal-->purgatory transition, and re-initialization must not happen while in the purgatory state. Add tsd_nominal() and use it as needed. Note that tsd_*{p,}_get() can still be used as long as no re-initialization that would require cleanup occurs. This means that e.g. the thread_allocated counter can be updated unconditionally.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
551ebc43647521bdd0bc78558b106762b3388928 |
|
03-Oct-2014 |
Jason Evans <jasone@canonware.com> |
Convert to uniform style: cond == false --> !cond
/external/jemalloc/include/jemalloc/internal/tcache.h
|
5460aa6f6676c7f253bfcb75c028dfd38cae8aaf |
|
23-Sep-2014 |
Jason Evans <jasone@canonware.com> |
Convert all tsd variables to reside in a single tsd structure.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
9c640bfdd4e2f25180a32ed3704ce8e4c4cc21f1 |
|
12-Sep-2014 |
Jason Evans <jasone@canonware.com> |
Apply likely()/unlikely() to allocation/deallocation fast paths.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
23fdf8b359a690f457c5300338f4994d06402b95 |
|
09-Sep-2014 |
Daniel Micay <danielmicay@gmail.com> |
Mark some conditions as unlikely: assertion failure; malloc_init failure; malloc not already initialized (in malloc_init); running in valgrind; thread cache disabled at runtime. Clang and GCC already consider a comparison with NULL or -1 to be cold, so many branches (e.g. out-of-memory) are already correctly treated as cold, and marking them is not important.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
3541a904d6fb949f3f0aea05418ccce7cbd4b705 |
|
17-Apr-2014 |
Jason Evans <je@fb.com> |
Refactor small_size2bin and small_bin2size. Refactor small_size2bin and small_bin2size to be inline functions rather than directly accessed arrays.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
3e3caf03af6ca579e473ace4daf25f63102aca4f |
|
17-Apr-2014 |
Jason Evans <jasone@canonware.com> |
Merge pull request #73 from bmaurer/smallmalloc Smaller malloc hot path
|
021136ce4db79f50031a1fd5dd751891888fbc7b |
|
16-Apr-2014 |
Ben Maurer <bmaurer@fb.com> |
Create a const array with only a small-bin-to-size map
/external/jemalloc/include/jemalloc/internal/tcache.h
|
a7619b7fa56f98d1ca99a23b458696dd37c12b77 |
|
15-Apr-2014 |
Ben Maurer <bmaurer@fb.com> |
outline rare tcache_get codepaths
/external/jemalloc/include/jemalloc/internal/tcache.h
|
bd87b01999416ec7418ff8bdb504d9b6c009ff68 |
|
16-Apr-2014 |
Jason Evans <je@fb.com> |
Optimize Valgrind integration. Forcefully disable tcache if running inside Valgrind, and remove Valgrind calls in tcache-specific code. Restructure Valgrind-related code to move most Valgrind calls out of the fast path functions. Take advantage of static knowledge to elide some branches in JEMALLOC_VALGRIND_REALLOC().
/external/jemalloc/include/jemalloc/internal/tcache.h
|
9b0cbf0850b130a9b0a8c58bd10b2926b2083510 |
|
11-Apr-2014 |
Jason Evans <je@fb.com> |
Remove support for non-prof-promote heap profiling metadata. Make promotion of sampled small objects to large objects mandatory, so that profiling metadata can always be stored in the chunk map, rather than requiring one pointer per small region in each small-region page run. In practice the non-prof-promote code was only useful when using jemalloc to track all objects and report them as leaks at program exit. However, Valgrind is at least as good a tool for this particular use case. Furthermore, the non-prof-promote code is getting in the way of some optimizations that will make heap profiling much cheaper for the predominant use case (sampling a small representative proportion of all allocations).
/external/jemalloc/include/jemalloc/internal/tcache.h
|
6e62984ef6ca4312cf0a2e49ea2cc38feb94175b |
|
16-Dec-2013 |
Jason Evans <jasone@canonware.com> |
Don't junk-fill reallocations unless usize changes. Don't junk fill reallocations for which the request size is less than the current usable size, but not enough smaller to cause a size class change. Unlike malloc()/calloc()/realloc(), *allocx() contractually treats the full usize as the allocation, so a caller can ask for zeroed memory via mallocx() and a series of rallocx() calls that all specify MALLOCX_ZERO, and be assured that all newly allocated bytes will be zeroed and made available to the application without danger of allocator mutation until the size class decreases enough to cause usize reduction.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
dda90f59e2b67903668a2799970f64df163e9ccf |
|
20-Oct-2013 |
Jason Evans <jasone@canonware.com> |
Fix a Valgrind integration flaw. Fix a Valgrind integration flaw that caused Valgrind warnings about reads of uninitialized memory in internal zero-initialized data structures (relevant to tcache and prof code).
/external/jemalloc/include/jemalloc/internal/tcache.h
|
06912756cccd0064a9c5c59992dbac1cec68ba3f |
|
01-Feb-2013 |
Jason Evans <je@fb.com> |
Fix Valgrind integration. Fix Valgrind integration to annotate all internally allocated memory in a way that keeps Valgrind happy about internal data structure access.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
88393cb0eb9a046000d20809809d4adac11957ab |
|
22-Jan-2013 |
Jason Evans <jasone@canonware.com> |
Add and use JEMALLOC_ALWAYS_INLINE. Add JEMALLOC_ALWAYS_INLINE and use it to guarantee that the entire fast paths of the primary allocation/deallocation functions are inlined.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
38067483c542adfe092644d1ecc103c6bc74add0 |
|
22-Jan-2013 |
Jason Evans <jasone@canonware.com> |
Tighten Valgrind integration. Tighten Valgrind integration such that immediately after memory is validated or zeroed, Valgrind is told to forget the memory's 'defined' state. The only place newly allocated memory should be left marked as 'defined' is in the public functions (e.g. calloc() and realloc()).
/external/jemalloc/include/jemalloc/internal/tcache.h
|
203484e2ea267e068a68fd2922263f0ff1d5ac6f |
|
02-May-2012 |
Jason Evans <je@fb.com> |
Optimize malloc() and free() fast paths. Embed the bin index for small page runs into the chunk page map, in order to omit [...] in the following dependent load sequence: ptr-->mapelm-->[run-->bin-->]bin_info. Move various non-critical code out of the inlined function chain into helper functions (tcache_event_hard(), arena_dalloc_small(), and locking).
/external/jemalloc/include/jemalloc/internal/tcache.h
|
f7088e6c992d079bc3162e0c48ed4dc5def6d263 |
|
20-Apr-2012 |
Jason Evans <je@fb.com> |
Make arena_salloc() an inline function.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
122449b073bcbaa504c4f592ea2d733503c272d2 |
|
06-Apr-2012 |
Jason Evans <je@fb.com> |
Implement Valgrind support, redzones, and quarantine. Implement Valgrind support, as well as the redzone and quarantine features, which help Valgrind detect memory errors. Redzones are only implemented for small objects because the changes necessary to support redzones around large and huge objects are complicated by in-place reallocation, to the point that it isn't clear that the maintenance burden is worth the incremental improvement to Valgrind support. Merge arena_salloc() and arena_salloc_demote(). Refactor i[v]salloc() to expose the 'demote' option.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
01b3fe55ff3ac8e4aa689f09fcb0729da8037638 |
|
03-Apr-2012 |
Jason Evans <jasone@canonware.com> |
Add a0malloc(), a0calloc(), and a0free(). Add a0malloc(), a0calloc(), and a0free(), which are used by FreeBSD's libc to allocate/deallocate TLS in static binaries.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
ae4c7b4b4092906c641d69b4bf9fcb4a7d50790d |
|
02-Apr-2012 |
Jason Evans <jasone@canonware.com> |
Clean up *PAGE* macros. s/PAGE_SHIFT/LG_PAGE/g and s/PAGE_SIZE/PAGE/g. Remove remnants of the dynamic-page-shift code. Rename the "arenas.pagesize" mallctl to "arenas.page". Remove the "arenas.chunksize" mallctl, which is redundant with "opt.lg_chunk".
/external/jemalloc/include/jemalloc/internal/tcache.h
|
f2296deb57cdda01685f0d0ccf3c6e200378c673 |
|
30-Mar-2012 |
Jason Evans <jasone@canonware.com> |
Clean up tsd (no functional changes).
/external/jemalloc/include/jemalloc/internal/tcache.h
|
09a0769ba7a3d139168e606e4295f8002861355f |
|
30-Mar-2012 |
Jason Evans <jasone@canonware.com> |
Work around TLS deallocation via free(). glibc uses memalign()/free() to allocate/deallocate TLS, which means that it is unsafe to set TLS variables as a side effect of free() -- they may already be deallocated. Work around this by avoiding tcache_create() within free(). Reported by Mike Hommey.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
d4be8b7b6ee2e21d079180455d4ccbf45cc1cee7 |
|
27-Mar-2012 |
Jason Evans <jasone@canonware.com> |
Add the "thread.tcache.enabled" mallctl.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
cd9a1346e96f71bdecdc654ea50fc62d76371e74 |
|
22-Mar-2012 |
Jason Evans <je@fb.com> |
Implement tsd. Implement tsd, which is a TLS/TSD abstraction that uses one or both internally. Modify bootstrapping such that no tsd's are utilized until allocation is safe. Remove malloc_[v]tprintf(), and use malloc_snprintf() instead. Fix %p argument size handling in malloc_vsnprintf(). Fix a long-standing statistics-related bug in the "thread.arena" mallctl that could cause crashes due to linked list corruption.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
e24c7af35d1e9d24d02166ac98cfca7cf807ff13 |
|
19-Mar-2012 |
Jason Evans <je@fb.com> |
Invert NO_TLS to JEMALLOC_TLS.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
4507f34628dfae26e6b0a6faa13e5f9a49600616 |
|
05-Mar-2012 |
Jason Evans <je@fb.com> |
Remove the lg_tcache_gc_sweep option. Remove the lg_tcache_gc_sweep option, because it is no longer very useful. Prior to the addition of dynamic adjustment of tcache fill count, it was possible for fill/flush overhead to be a problem, but this problem no longer occurs.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
3add8d8cda2993f58fd2eba6efbf4fa12d5c72f3 |
|
29-Feb-2012 |
Jason Evans <je@fb.com> |
Remove unused variables in tcache_dalloc_large(). Submitted by Mike Hommey.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
b172610317babc7f365584ddd7fdaf4eb8d9d04c |
|
29-Feb-2012 |
Jason Evans <je@fb.com> |
Simplify small size class infrastructure. Program-generate small size class tables for all valid combinations of LG_TINY_MIN, LG_QUANTUM, and PAGE_SHIFT. Use the appropriate table to generate all relevant data structures, and remove the distinction between tiny/quantum/cacheline/subpage bins. Remove --enable-dynamic-page-shift. This option didn't prove useful in practice, and it prevented optimizations. Add Tilera architecture support.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
962463d9b57bcc65de2fa108a691b4183b9b2faf |
|
13-Feb-2012 |
Jason Evans <je@fb.com> |
Streamline tcache-related malloc/free fast paths. tcache_get() is inlined, so do the config_tcache check inside tcache_get() and simplify its callers. Make arena_malloc() an inline function, since it is part of the malloc() fast path. Remove conditional logic that caused build issues if --disable-tcache was specified.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
fd56043c53f1cd1335ae6d1c0ee86cc0fbb9f12e |
|
13-Feb-2012 |
Jason Evans <je@fb.com> |
Remove magic. Remove structure magic, because 1) it is no longer conditional, and 2) it stopped being very effective at detecting memory corruption several years ago.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
7372b15a31c63ac5cb9ed8aeabc2a0a3c005e8bf |
|
11-Feb-2012 |
Jason Evans <je@fb.com> |
Reduce cpp conditional logic complexity. Convert configuration-related cpp conditional logic to use static constant variables, e.g.: #ifdef JEMALLOC_DEBUG [...] #endif becomes: if (config_debug) { [...] } The advantage is clearer, more concise code. The main disadvantage is that data structures no longer have conditionally defined fields, so they pay the cost of all fields regardless of whether they are used. In practice, this is only a minor concern; config_stats will go away in an upcoming change, and config_prof is the only other major feature that depends on more than a few special-purpose fields.
/external/jemalloc/include/jemalloc/internal/tcache.h
|
7427525c28d58c423a68930160e3b0fe577fe953 |
|
01-Apr-2011 |
Jason Evans <jasone@canonware.com> |
Move repo contents in jemalloc/ to top level.
/external/jemalloc/include/jemalloc/internal/tcache.h
|