History log of /external/valgrind/helgrind/hg_basics.c
Revision Date Author Comments
328d6627c26471332610da3f5a0b9cc3cdd410c7 25-May-2015 philippe <philippe@a5019735-40e9-0310-863c-91ae7b9d1cf9> This patch significantly decreases the memory needed for OldRef and
slightly increases the performance. It also moderately increases
the number of cases where helgrind can provide the stack trace of the old
access (when using the same amount of memory for the OldRef entries).
The patch also provides a new helgrind monitor command to show
the recorded accesses for an address+len, and adds an optional argument
lock_address to the monitor command 'info locks', to show the info
about just this lock.

Currently, OldRef entries are maintained in a sparse WA that points to N
entries, as specified by --conflict-cache-size=N.
For each entry (associated with an address), we have the last 5 accesses.

Old entries are recycled in an exact LRU order.
But inside an entry, we could have a recent access, and 4 very
old accesses that are kept 'alive' by a single thread repeatedly
accessing the address shared with the 4 other old entries.


The attached patch replaces the sparse WA that maintains the OldRef
by a hash table.
Each OldRef now also only maintains one single access for an address.
As an OldRef now maintains only one access, all the entries are now
strictly in LRU mode.
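
As a rough illustration of the scheme described above (a hash table of
entries, each holding one single access, recycled in strict LRU order), here
is a minimal sketch. It is not the Valgrind code: the names Ref, pool,
record_access and lookup, the tiny fixed sizes, and the choice of keying on
the address alone are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_ENTRIES 4   /* hypothetical stand-in for --conflict-cache-size */
#define N_BUCKETS 8   /* hypothetical number of hash chains */

typedef struct Ref {
    uintptr_t   addr;         /* key: the accessed address */
    int         thread;       /* the single access recorded for addr */
    struct Ref *chain;        /* next entry in the same hash bucket */
    struct Ref *prev, *next;  /* LRU list, most recent entry first */
} Ref;

static Ref    pool[N_ENTRIES];     /* fixed pool of entries */
static size_t n_used;              /* number of pool entries handed out */
static Ref   *buckets[N_BUCKETS];  /* hash chains */
static Ref   *lru_head, *lru_tail; /* most and least recently used */

static size_t bucket_of(uintptr_t a) { return (size_t)(a % N_BUCKETS); }

/* Find the entry for addr, or NULL if none is recorded. */
static Ref *lookup(uintptr_t addr)
{
    for (Ref *r = buckets[bucket_of(addr)]; r != NULL; r = r->chain)
        if (r->addr == addr)
            return r;
    return NULL;
}

static void lru_unlink(Ref *r)
{
    if (r->prev) r->prev->next = r->next; else lru_head = r->next;
    if (r->next) r->next->prev = r->prev; else lru_tail = r->prev;
}

static void lru_push_front(Ref *r)
{
    r->prev = NULL;
    r->next = lru_head;
    if (lru_head) lru_head->prev = r; else lru_tail = r;
    lru_head = r;
}

/* Remove r from its hash chain; r must currently be in the table. */
static void bucket_remove(Ref *r)
{
    Ref **pp = &buckets[bucket_of(r->addr)];
    while (*pp != r)
        pp = &(*pp)->chain;
    *pp = r->chain;
}

/* Record an access, recycling the least recently used entry when full. */
static void record_access(uintptr_t addr, int thread)
{
    Ref *r = lookup(addr);
    if (r != NULL) {
        lru_unlink(r);               /* known address: refresh it below */
    } else {
        if (n_used < N_ENTRIES) {
            r = &pool[n_used++];     /* still room in the pool */
        } else {
            r = lru_tail;            /* recycle the oldest entry */
            assert(r != NULL);
            lru_unlink(r);
            bucket_remove(r);
        }
        r->addr  = addr;
        r->chain = buckets[bucket_of(addr)];
        buckets[bucket_of(addr)] = r;
    }
    r->thread = thread;
    lru_push_front(r);
}
```

Because every entry carries exactly one access, refreshing an entry cannot
keep unrelated old accesses alive, which is the problem described for the
5-access entries of the trunk.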

Memory used for OldRef
-----------------------
For the trunk, an OldRef has a size of 72 bytes (on 32 bits archs)
maintaining up to 5 accesses to the same address.
On 64 bits arch, an OldRef is 104 bytes.

With the patch, an OldRef has a size of 32 bytes (on 32 bits archs)
or 56 bytes (on 64 bits archs).

So, for one single access, the new code needs (on 32 bits)
32 bytes, while the trunk needs only 14.4 bytes.
However, that is the worst case, assuming that the 5 entries in the
accs array are all used.
Looking at 2 big apps (one of them being firefox), we see that
we have very few OldRef entries that have the 5 entries occupied.
On a firefox startup, of the 5x1,000,000 accesses, we only have
1,406,939 accesses that are used.
So, on average, the trunk uses in reality around 52 bytes per access.
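
The per-access figures above follow from dividing the total OldRef storage by
the number of access slots actually occupied. A small sketch of that
arithmetic, where the helper name bytes_per_access is an assumption made for
the example:

```c
#include <assert.h>

/* Average bytes of OldRef storage per recorded access.
   entry_size:  size of one OldRef entry, in bytes
   n_entries:   number of OldRef entries (--conflict-cache-size)
   n_used_accs: number of access slots actually occupied */
static double bytes_per_access(double entry_size, double n_entries,
                               double n_used_accs)
{
    assert(n_used_accs > 0);
    return entry_size * n_entries / n_used_accs;
}
```

With the 32 bits trunk size of 72 bytes and 1,000,000 entries, all
5,000,000 slots being used gives 14.4 bytes per access, while the 1,406,939
slots observed on the firefox startup give a little over 51 bytes, in line
with the "around 52" quoted above.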

The default value for --conflict-cache-size has been doubled to 2000000.
This ensures that the memory used for the OldRef is more or less the
same as the trunk (104MB for OldRef entries).

Memory used for sparseWA versus hashtable
-----------------------------------------
Looking at 2 big apps (one of them being firefox), we see that
there are big variations on the size of the WA : it can go in a few
seconds from 10MB to 250MB, or can decrease back to 10 MB.
This all depends where the last N accesses were done: if well localised,
the WA will be small.
If the last N accesses were distributed over a big address space,
then the WA will be big: the last level of WA (the biggest memory consumer)
uses slightly more than 1KB (2KB on 64 bits) for each '256 bytes' memory
zone where there is an oldref. So, in the worst case, on 32 bits, we
need > 1_000_000_000 sparseWA memory to keep 1_000_000 OldRef.

The hash table has between 1 and 2 Words of overhead per OldRef
(as the chain array is +- doubled each time the hash table is full).
So, unless the OldRef are extremely localised, the overhead of the
hash table will be significantly less.

With the patch, the core arena total alloc is:
5299535/1201448632 totalloc-blocks/bytes
The trunk is
6693111/3959050280 totalloc-blocks/bytes
(so, around 1.20GB versus 3.96GB).
This big difference is due to the fact that the sparseWA repeatedly
allocates then frees Level0 or LevelN when OldRef in the region covered
by the Level0/N have all been recycled.

In terms of CPU
---------------
With the patch, on amd64, a firefox startup seems slightly faster (around 1%).
The peak memory mmaped/used decreases by 200MB.
For a libreoffice test, the memory decreases by 230MB. CPU also decreases
slightly (1%).


In terms of correctness:
-----------------------
The trunk could potentially show an access that is not the most recent access
to the memory of a race : the first OldRef entry matching the raced upon
address was used, while we could have a more recent access in a following
OldRef entry. In other words, the trunk only guaranteed to find the
most recent access in an OldRef, but not between the several OldRef that
could cover the raced upon address.
So, assuming it is important to show the most recent access, this patch
ensures we really show the most recent access, even in presence of overlapping
accesses.



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@15289 a5019735-40e9-0310-863c-91ae7b9d1cf9
/external/valgrind/helgrind/hg_basics.c
68790a73bcb290746a5b34c44538c3b2728eaaec 13-Sep-2014 florian <florian@a5019735-40e9-0310-863c-91ae7b9d1cf9> VG_(malloc/calloc/strdup) never return NULL (and never will).
So it's pointless to test or assert their return values.
Remove code doing so.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14528 a5019735-40e9-0310-863c-91ae7b9d1cf9
/external/valgrind/helgrind/hg_basics.c
0f157ddb404bcde7815a1c5bf2d7e41c114f3d73 18-Oct-2013 sewardj <sewardj@a5019735-40e9-0310-863c-91ae7b9d1cf9> Update copyright dates (20XY-2012 ==> 20XY-2013)


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13658 a5019735-40e9-0310-863c-91ae7b9d1cf9
/external/valgrind/helgrind/hg_basics.c
19f91bbaedb4caef8a60ce94b0f507193cc0bc10 10-Nov-2012 florian <florian@a5019735-40e9-0310-863c-91ae7b9d1cf9> Fix more Char/HChar mixups. Closing in...


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13119 a5019735-40e9-0310-863c-91ae7b9d1cf9
/external/valgrind/helgrind/hg_basics.c
54fe2021b87b9e5edb8ec8070f47b86d5cafb8aa 28-Oct-2012 florian <florian@a5019735-40e9-0310-863c-91ae7b9d1cf9> Char/HChar and constness fixes. Mostly cost center
on allocators which is always a const HChar *


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@13089 a5019735-40e9-0310-863c-91ae7b9d1cf9
/external/valgrind/helgrind/hg_basics.c
03f8d3fc25f5a45c5826259d1b33b7f310117279 05-Aug-2012 sewardj <sewardj@a5019735-40e9-0310-863c-91ae7b9d1cf9> Update copyright dates to include 2012.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12843 a5019735-40e9-0310-863c-91ae7b9d1cf9
/external/valgrind/helgrind/hg_basics.c
ec062e8d96a361af9905b5447027819dfbfee01a 23-Oct-2011 sewardj <sewardj@a5019735-40e9-0310-863c-91ae7b9d1cf9> Update all copyright dates, from 20xy-2010 to 20xy-2011.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12206 a5019735-40e9-0310-863c-91ae7b9d1cf9
/external/valgrind/helgrind/hg_basics.c
ffce8159a95134f0a2bc1cea3c3e6e265f096d9f 24-Jun-2011 sewardj <sewardj@a5019735-40e9-0310-863c-91ae7b9d1cf9> Merge the contents of the HGDEV2 branch into trunk:
* performance and scalability improvements
* show locks held by both threads in a race
* show all 4 locks involved in a lock order violation
* better delimited error messages



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11824 a5019735-40e9-0310-863c-91ae7b9d1cf9
/external/valgrind/helgrind/hg_basics.c
622fe49b55cb60d6132bb100236f591de1515146 11-Mar-2011 sewardj <sewardj@a5019735-40e9-0310-863c-91ae7b9d1cf9> Add free-is-write functionality (experimental, not enabled by default).


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11627 a5019735-40e9-0310-863c-91ae7b9d1cf9
/external/valgrind/helgrind/hg_basics.c
9eecbbb9a9cbbd30b903c09a9e04d8efc20bda33 03-May-2010 sewardj <sewardj@a5019735-40e9-0310-863c-91ae7b9d1cf9> Update copyright dates to 2010.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@11121 a5019735-40e9-0310-863c-91ae7b9d1cf9
/external/valgrind/helgrind/hg_basics.c
23f1200ba3aa3d8dbb484626ba1bdb7cfcf3b3a9 24-Jul-2009 sewardj <sewardj@a5019735-40e9-0310-863c-91ae7b9d1cf9> Various improvements:

* rename many functions to do with shadow memory handling, to
more clearly differentiate reads and writes directly of the
shadow state from client reads and writes, each of which
generate both a read and a write of the client state. It was
getting confusing (== hard to verify) in there.

* use idempotency of memory state machine transition rules to
speed up long sequential sections, speedups in range 0% to 28%

* remove 4-way Pord (EQ, LT, GT, UN) and associated machinery,
and replace it with something that merely computes LEQ in the
partial ordering, since that's all that is necessary, and
this simplifies some fast-case paths.

* add optional approx history mechanism a la DRD (start/end stack
of conflicting segment), much faster if you don't need exact
conflicting-access details

* libhb_so_recv: tick the VTS in the receiving thread; don't just
join with the VC in the SO. It's probably correct without this
modification, but that correctness is fragile and depends on
complex properties of how SOs are used/created. Much better to
be completely safe. (Needs cache-isation).

* get rid of unnecessary shadow memory state "SVal_NOACCESS"
and simplify associated fast-case paths in msmc{read,write}
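
To illustrate the Pord item in the list above: a LEQ test on a partial order
can stop at the first component that fails, whereas a 4-way comparison (EQ,
LT, GT, UN) must track both directions. The sketch below assumes clocks are
plain arrays of equal length; the name vc_leq and this representation are
assumptions made for the example, not the libhb data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true iff clock a is <= clock b in every component.
   Both arrays must hold n components. */
static bool vc_leq(const unsigned long *a, const unsigned long *b, size_t n)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++)
        if (a[i] > b[i])
            return false;   /* one failing component decides the answer */
    return true;
}
```

Two clocks are unordered exactly when vc_leq fails in both directions, so
the 4-way answer can still be recovered from two LEQ calls when it is needed.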



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@10589 a5019735-40e9-0310-863c-91ae7b9d1cf9
/external/valgrind/helgrind/hg_basics.c
9f207460d70d38c46c9e81996a3dcdf90961c6db 10-Mar-2009 njn <njn@a5019735-40e9-0310-863c-91ae7b9d1cf9> Updated copyright years.



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@9344 a5019735-40e9-0310-863c-91ae7b9d1cf9
/external/valgrind/helgrind/hg_basics.c
849b0ed71673805c5bdc3e44b1743a3d2c1b513d 21-Dec-2008 sewardj <sewardj@a5019735-40e9-0310-863c-91ae7b9d1cf9> Various changes:

* remove flags --trace-addr= and --trace-level=. These no longer
have any effect, so there's no point in having the associated flags.

* add flag --show-conflicts=no|yes [yes], which makes it possible to
disable the conflicting-access collection machinery. This makes
Helgrind run much faster. Perhaps useful in regression testing,
when it is desired only to find out if a race exists, but not to
collect enough information to easily diagnose it.

* add flag --conflict-cache-size= [1000000], which makes it possible
to control how much memory is used for storage of information about
historical (potentially-conflicting) accesses.

* Update comments on the conflicting-access machinery to more closely
reflect the code. Includes comments on the important aspects of
the value N_OLDREF_ACCS. Increase said constant from 3 to 5.

* Fix bug in event_map_bind: when searching for an OldRef.accs[]
entry that matches the current access, don't forget to also
compare the access sizes. The old code only compared the thread
identity and the read/writeness.

* hg_main.c: disable Dwarf3 variable/type info reading by default.
Mostly this provides little benefit and can cause Helgrind to use
a lot more time and memory at startup.



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@8845 a5019735-40e9-0310-863c-91ae7b9d1cf9
/external/valgrind/helgrind/hg_basics.c
f98e1c03ce4bea1fb092cdea5571c41f29f6df9b 25-Oct-2008 sewardj <sewardj@a5019735-40e9-0310-863c-91ae7b9d1cf9> Merge Helgrind from branches/YARD into the trunk. Also includes some
minor changes to make stack unwinding on amd64-linux approximately
twice as fast as it was before.



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@8707 a5019735-40e9-0310-863c-91ae7b9d1cf9
/external/valgrind/helgrind/hg_basics.c