Discussion:
[Gc] weak maps and libgc
Andy Wingo
2011-10-21 11:31:14 UTC
Permalink
Hello,

I and a few others have written in the past about the difficulties of
implementing weak maps with libgc. Properly implementing weak maps
requires GC integration. Currently I do terrible things with the pre-GC
hook marking a flag that causes a vacuum procedure to run after GC. I
had not thought about the problem of cycles from values to keys in
key-weak maps though:

http://www.jucs.org/jucs_14_21/eliminating_cycles_in_weak/jucs_14_21_3481_3497_barros.pdf

To implement this sort of thing, you need tighter GC integration. But
providing those hooks limits the GC. To avoid publicly exposing those
hooks, it would be possible to incorporate a weak map or weak set
implementation into libgc itself. What do you think?

Andy
--
http://wingolog.org/
Ivan Maidanski
2011-10-21 14:40:07 UTC
Permalink
Hi Andy,

I haven't implemented weak maps on top of bdwgc directly, but I have implemented the Java weakness model (Weak/Soft/Phantom/JNI_Weak) based only on this collector's API, i.e., using only GC_register_finalizer_no_order (in Java-finalization mode), GC_malloc_atomic, GC_general_register_disappearing_link, GC_call_with_alloc_lock and GC_HIDE_POINTER. (I can't say it is efficient, though; for example, I have to allocate and register a proxy object for each weak reference registered in a ReferenceQueue.)
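The disappearing-link pattern Ivan describes might be sketched roughly as follows. This is an illustrative fragment against the bdwgc API, not his actual code: the weak_ref type and function names are my own, the proxy-object machinery he mentions is omitted, and it assumes libgc is available.

```c
#include <gc/gc.h>

/* A weak reference: 'referent' holds a hidden pointer that the
   collector zeroes when the target becomes unreachable. Hiding the
   pointer (GC_HIDE_POINTER) keeps the GC from treating this field as
   a strong reference; allocating the cell with GC_malloc_atomic keeps
   the GC from scanning it at all. */
typedef struct {
    GC_hidden_pointer referent;   /* cleared to 0 by the GC */
} weak_ref;

static weak_ref *weak_ref_new(void *target)
{
    weak_ref *wr = GC_malloc_atomic(sizeof *wr);
    wr->referent = GC_HIDE_POINTER(target);
    GC_general_register_disappearing_link((void **) &wr->referent, target);
    return wr;
}

static void *reveal(void *data)
{
    weak_ref *wr = data;
    return wr->referent ? (void *) GC_REVEAL_POINTER(wr->referent) : NULL;
}

static void *weak_ref_get(weak_ref *wr)
{
    /* GC_call_with_alloc_lock keeps the collector from clearing the
       link between the NULL check and the reveal. */
    return GC_call_with_alloc_lock(reveal, wr);
}
```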

And, taking into account that the Java runtime has implementations of weak maps built on Java references, I don't immediately see the problems with implementing a weak map (at least in Java terms) without altering the current GC API.

Could you please summarize your past writings about the implementation difficulties?
Thanks.

Regards.
Post by Andy Wingo
Hello,
I and a few others have written in the past about the difficulties of
implementing weak maps with libgc. Properly implementing weak maps
requires GC integration. Currently I do terrible things with the pre-GC
hook marking a flag that causes a vacuum procedure to run after GC. I
had not thought about the problem of cycles from values to keys in
key-weak maps though:
http://www.jucs.org/jucs_14_21/eliminating_cycles_in_weak/jucs_14_21_3481_3497_barros.pdf
To implement this sort of thing, you need tighter GC integration. But
providing those hooks limits the GC. To avoid publicly exposing those
hooks, it would be possible to incorporate a weak map or weak set
implementation into libgc itself. What do you think?
Andy
--
http://wingolog.org/
_______________________________________________
Gc mailing list
http://www.hpl.hp.com/hosted/linux/mail-archives/gc/
Andy Wingo
2011-10-25 10:07:33 UTC
Permalink
Hi Ivan,

Thanks for the response!
Post by Ivan Maidanski
Could you please summarize your past writings about the
implementation difficulties? Thanks.
Last weekend I rewrote Guile's weak maps and weak sets, and so these
things are fresh in my mind.

1. You can't reliably use the "buckets and chains" strategy to
implement weak hash tables with libgc.

The reason is that with this strategy, creating an entry in the
table allocates memory: typically one spine segment and one rib
segment. Typically you would allocate the rib with a GC kind that
only marks the value if the key has not been nulled out.

However, if the GC collects the key and nulls out one or both parts
of the rib, or even the link from the spine to the rib, you are
still left with at least an extra spine segment.

You can lazily collect spine segments, but that will only happen
when your hash table gets full, relatively speaking. Or you can go
over the whole table and remove dead ribs and their corresponding
spine segments, but this is expensive. However, when would you do
this? The right time to do it is after GC, when disappearing links
are nulled out, but there is no after-GC hook. More on this
later.

Consider a use pattern like:

foo = new_weak_table ();
while (1)
  weak_table_set (foo, new_key (), new_value ());

If you never go over the whole table to remove links, *there is
practically no upper bound on your memory usage*: you keep adding
entries to the table; GC happens less and less frequently relative
to the insertion rate, because of the rising heap size (due to
spines, ribs, and the keys and values that haven't yet been
collected); and the dead mappings still take up memory (via the
spines and ribs). For some loads it is actually worse if you have
the ability to resize tables.

I haven't proven that result mathematically, but I'm convinced
that implementing weak maps as buckets-and-chains hash tables is a
bad idea with libgc. So the new weak maps in Guile are
open-addressed linear-probe tables with robin hood collision
resolution. That solves this dead-entries-occupying-memory issue.

2. Keeping an accurate idea of the table's load factor is important
for knowing when to resize the table, either up or down.

However if the GC is constantly removing entries, then your idea of
how many elements are in the table is imprecise. In a
multithreaded program there is no way to know how many elements are
actually in your table. You can know the upper bound but not the
number of those elements that have been removed.

This is important in the workload for (1). Say your load has
reached the point at which you would resize up. But what if you
actually don't need to do so, because there are enough nulled out
entries? Unfortunately you can't know that.

Also it's not a good idea to traverse the table looking for nulled
entries before making the resize decision, and not only because
it's expensive to visit each entry. The real reason is that if a
large portion of your keys are really alive, you could get into
tortoise-and-hare races in which reaping the dead entries in the
table does allow you to move forward, but you end up traversing the
whole table more times than you would if you just resized anyway.

The only way to keep an accurate count in a weak table is to have
integration with the GC. If the GC nulls out an element in a
particular memory region, it could decrement a counter. But there
is no support for this in libgc.

3. The proper time to update the count for a weak table is after GC,
as I have said. But libgc does not have any notifier that a GC
has happened. It has the start_callback, but you can't generally
do anything there; what I need is a stop_callback that runs
outside the alloc lock, in which I can allocate memory.
Currently with newer GC I use the start_callback to tell Guile to
register a procedure to be called "whenever", which will happen
after GC. With older GC I have to play a trick with finalizers;
it's in the attached code, and is terrible.

4. Moving items in an open-addressed table requires moving
disappearing links. There is no way to do this currently in libgc:
you have to unregister one and register another. This allocates
memory.

5. Accessing elements of a weak table is very expensive, as it has to
happen under the alloc lock. If weak tables were part of the GC,
element access could perhaps be made cheaper. But I won't push this,
as I also need custom equality predicates. Currently traversing a
set with an occupancy of about 4000 of 7000 slots, and reshuffling
the table as nulled elements are removed, takes about 100
nanoseconds per element. I'm estimating that based on the total of
about 300 or 500 microseconds to reap the table, which is a lot.
As you can imagine there is a fair amount of variance depending on
the number of items that need to be moved.

6. There is no way around the fact that values in a weak-key map are
strong references, and can thus keep their keys alive
indefinitely. This issue is covered in the ephemeron paper I
linked to, and could also be solved by the GC.

I have attached my new weak set implementation, as it shows most of the
issues. Weak maps are the same except the entry has a value as well,
and for singly-weak tables (key-weak or value-weak, but not
doubly-weak), we have to use a special GC kind to get the marking
right. (Doubly weak tables can just use atomic memory regions.)

Thanks for reading. The implementation follows. It's pretty
straightforward but you'll need a couple of typedefs to get it:

/* Function that returns nonzero if the given object is the one we are
   looking for. */
typedef int (*scm_t_set_predicate_fn) (SCM obj, void *closure);

/* Function to fold over the elements of a set. */
typedef SCM (*scm_t_set_fold_fn) (void *closure, SCM key, SCM result);
Andy Wingo
2011-10-25 20:05:53 UTC
Permalink
Hi,
Post by Ivan Maidanski
Could you please summarize your past writings about the
implementation difficulties? Thanks.
More concretely, I would be happy if:

(1) There were a variant of GC_register_disappearing_link that also
incremented an `unsigned long' somewhere in memory. This would
allow me to keep an accurate count of how many items are in a weak
hash table, and remove my need to run something after GC.
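Concretely, (1) might look something like the declaration below. The name and signature are hypothetical; no such function exists in bdwgc.

```c
/* Like GC_general_register_disappearing_link(link, obj), but when the
   collector clears *link it also increments *counter. A weak table
   passes a counter it owns, so the number of cleared entries stays
   accurate per table with no after-GC hook. HYPOTHETICAL SKETCH ONLY:
   this function is not part of the bdwgc API. */
GC_API int GC_CALL GC_general_register_disappearing_link_counted(
    void **link, const void *obj, unsigned long *counter);
```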

(2) There were a function to move disappearing links, perhaps callable
only within the alloc lock.

I did some profiling today with the open-addressed table. It has
to shuffle items around sometimes. Unregistering then registering
disappearing links shows up pretty high on the profile.

(3) It would be nice if there were a callback after GC. It would let
me accurately measure the time in GC, for example. This isn't
relevant to the current discussion though.

What do you think? I am happy to look at coding up these features, but
I would like your input first.

Andy
--
http://wingolog.org/
Ivan Maidanski
2011-10-26 08:18:44 UTC
Permalink
Hi Andy,
Post by Andy Wingo
Hi,
Post by Ivan Maidanski
Could you please summarize your past writings about the
implementation difficulties? Thanks.
(1) There were a variant of GC_register_disappearing_link that also
incremented an `unsigned long' somewhere in memory. This would
allow me to keep an accurate count of how many items are in a weak
hash table, and remove my need to run something after GC.
You mean a statistics counter, right?
Good idea.
Post by Andy Wingo
(2) There were a function to move disappearing links, perhaps callable
only within the alloc lock.
I did some profiling today with the open-addressed table. It has
to shuffle items around sometimes. Unregistering then registering
disappearing links shows up pretty high on the profile.
Is my understanding correct that you want to optimize the following?

// assume GC_general_register_disappearing_link(&link, obj1);
...
GC_unregister_disappearing_link(&link);
GC_general_register_disappearing_link(&link, obj2);

Right? If yes, again I don't mind.
Post by Andy Wingo
(3) It would be nice if there were a callback after GC. It would let
me accurately measure the time in GC, for example. This isn't
relevant to the current discussion though.
What about the existing GC_set_start_callback?
Post by Andy Wingo
What do you think? I am happy to look at coding up these features, but
I would like your input first.
Ok. My 'input' is above.

Regards.
Post by Andy Wingo
Andy
--
http://wingolog.org/
Andy Wingo
2011-10-26 09:00:04 UTC
Permalink
Thanks for the thoughts, Ivan!
Post by Ivan Maidanski
(1) a variant of GC_register_disappearing_link that also
incremented an `unsigned long' somewhere in memory
You mean statistic counter, right?
Good idea.
Yes, but ideally one that can be associated with a particular memory
location. So, collecting a weak reference in collection A would
increment a counter associated with A, but not affect a counter
associated with collection B.
Post by Ivan Maidanski
// assume GC_general_register_disappearing_link(&link, obj1);
...
GC_unregister_disappearing_link(&link);
GC_general_register_disappearing_link(&link, obj2);
Right? If yes, I don't mind again.
OK, great. (Embarrassingly, late yesterday I realized that part of my
overhead was caused by terrible flaws in my hash functions, causing
extra collisions and thus reshuffling. It's not often that you get to
replace a hash function and speed up your program by 25%!)
Post by Ivan Maidanski
(3) It would be nice if there were a callback after GC. It would let
me accurately measure the time in GC, for example. This isn't
relevant to the current discussion though.
What's about existing GC_set_start_callback?
That callback is tricky, because you can't do much in it. But it is
good for starting a timer. Then you need a GC_set_stop_callback to stop
the timer and record the time spent in GC.

Another thing a GC_set_stop_callback would be useful for would be
statistical profiling of the causes of GC -- just sample the call stack
in the stop callback. Such a sampler is likely to allocate memory
though and so it's not appropriate for a start callback.
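The timer idea can be sketched as follows. GC_set_start_callback is real bdwgc API; the stop callback is the hypothetical addition under discussion, so on_gc_stop has nothing to hook into today.

```c
/* Measuring total time spent in GC, assuming a GC_set_stop_callback
   existed alongside the real GC_set_start_callback. */
#include <gc/gc.h>
#include <time.h>

static struct timespec gc_t0;
static double gc_total_seconds;

static void on_gc_start(void)   /* registered via GC_set_start_callback */
{
    clock_gettime(CLOCK_MONOTONIC, &gc_t0);
}

static void on_gc_stop(void)    /* HYPOTHETICAL: GC_set_stop_callback */
{
    struct timespec t1;
    clock_gettime(CLOCK_MONOTONIC, &t1);
    gc_total_seconds += (t1.tv_sec - gc_t0.tv_sec)
                        + (t1.tv_nsec - gc_t0.tv_nsec) * 1e-9;
}
```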

Regards,

Andy
--
http://wingolog.org/
Ivan Maidanski
2011-10-26 09:36:05 UTC
Permalink
Hi Hans,

What's your opinion of these 3 ideas?

PS. Anyway, it won't go into 7.2 branch.

Regards.
Post by Andy Wingo
Thanks for the thoughts, Ivan!
Post by Ivan Maidanski
(1) a variant of GC_register_disappearing_link that also
incremented an `unsigned long' somewhere in memory
You mean statistic counter, right?
Good idea.
Yes, but ideally one that can be associated with a particular memory
location. So, collecting a weak reference in collection A would
increment a counter associated with A, but not affect a counter
associated with collection B.
Post by Ivan Maidanski
// assume GC_general_register_disappearing_link(&link, obj1);
...
GC_unregister_disappearing_link(&link);
GC_general_register_disappearing_link(&link, obj2);
Right? If yes, I don't mind again.
OK, great. (Embarrassingly, late yesterday I realized that part of my
overhead was caused by terrible flaws in my hash functions, causing
extra collisions and thus reshuffling. It's not often that you get to
replace a hash function and speed up your program by 25%!)
Post by Ivan Maidanski
(3) It would be nice if there were a callback after GC. It would let
me accurately measure the time in GC, for example. This isn't
relevant to the current discussion though.
What's about existing GC_set_start_callback?
That callback is tricky, because you can't do much in it. But it is
good for starting a timer. Then you need a GC_set_stop_callback to stop
the timer and record the time spent in GC.
Another thing a GC_set_stop_callback would be useful for would be
statistical profiling of the causes of GC -- just sample the call stack
in the stop callback. Such a sampler is likely to allocate memory
though and so it's not appropriate for a start callback.
Regards,
Andy
--
http://wingolog.org/
Boehm, Hans
2011-11-08 01:23:32 UTC
Permalink
Sorry about the belated reply. Again.
-----Original Message-----
Sent: Wednesday, October 26, 2011 2:36 AM
To: Boehm, Hans; Andy Wingo
Cc: gc
Subject: Re[4]: [Gc] weak maps and libgc
Hi Hans,
What's your opinion of these 3 ideas?
PS. Anyway, it won't go into 7.2 branch.
Regards.
Post by Andy Wingo
Thanks for the thoughts, Ivan!
Post by Ivan Maidanski
(1) a variant of GC_register_disappearing_link that also
incremented an `unsigned long' somewhere in memory
You mean statistic counter, right?
Good idea.
Yes, but ideally one that can be associated with a particular memory
location. So, collecting a weak reference in collection A would
increment a counter associated with A, but not affect a counter
associated with collection B.
I'm not sure I understand all the trade-offs here. But isn't this doable with (possibly chained) finalizers already? I don't have a very strong feeling either way, but this doesn't feel like a particularly clean interface addition to me.
Post by Andy Wingo
Post by Ivan Maidanski
// assume GC_general_register_disappearing_link(&link, obj1); ...
GC_unregister_disappearing_link(&link);
GC_general_register_disappearing_link(&link, obj2);
Right? If yes, I don't mind again.
I'm OK with that one, too. It seems a bit esoteric, but it has a clean definition, and there seems to be a clear performance gain. It would be nice if it didn't get pulled in for embedded clients that don't use it and care about code size.
Post by Andy Wingo
OK, great. (Embarrassingly, late yesterday I realized that part of my
overhead was caused by terrible flaws in my hash functions, causing
extra collisions and thus reshuffling. It's not often that you get to
replace a hash function and speed up your program by 25%!)
Post by Ivan Maidanski
(3) It would be nice if there were a callback after GC. It would let
me accurately measure the time in GC, for example. This isn't
relevant to the current discussion though.
What's about existing GC_set_start_callback?
That callback is tricky, because you can't do much in it. But it is
good for starting a timer. Then you need a GC_set_stop_callback to
stop the timer and record the time spent in GC.
Another thing a GC_set_stop_callback would be useful for would be
statistical profiling of the causes of GC -- just sample the call
stack in the stop callback. Such a sampler is likely to allocate
memory though and so it's not appropriate for a start callback.
I'm OK with this IF you can reasonably define what it means. What does it mean with incremental/generational GC? Is it invoked only at the end of full collections? Can you get two concurrent invocations for different GCs because the first one didn't finish before the second one started? If not, how do you prevent it?

Hans
Post by Andy Wingo
Regards,
Andy
--
http://wingolog.org/
Ivan Maidanski
2011-11-08 07:23:51 UTC
Permalink
Hi Hans,

Regarding GC_move_disappearing_link: I pushed the code to master yesterday (finalize.c).
To address concerns about app code size, I think it's better to move all the disappearing-link-related functions to a standalone .c file.
BTW: GC_move_disappearing_link(&link, &link) can be used to test whether a link is registered.
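The resulting call pattern might look like the fragment below. It assumes a libgc built with the new function; move_entry is my own hypothetical helper, not code from either implementation.

```c
#include <gc/gc.h>

/* Relocate a weak entry during reshuffling without allocating:
   copy the (hidden) pointer, then transfer the registration. In real
   code this must run with the allocation lock held so the collector
   cannot clear the link mid-move. */
static void move_entry(void **from, void **to)
{
    *to = *from;
    *from = NULL;
    GC_move_disappearing_link(from, to);
}

/* Ivan's trick: moving a link onto itself succeeds only if the link
   is registered, so it doubles as a membership test:
       if (GC_move_disappearing_link(&link, &link) == GC_SUCCESS) ...  */
```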

Regards.
Post by Boehm, Hans
Sorry about the belated reply. Again.
-----Original Message-----
Sent: Wednesday, October 26, 2011 2:36 AM
To: Boehm, Hans; Andy Wingo
Cc: gc
Subject: Re[4]: [Gc] weak maps and libgc
Hi Hans,
What's your opinion of these 3 ideas?
PS. Anyway, it won't go into 7.2 branch.
Regards.
Post by Andy Wingo
Thanks for the thoughts, Ivan!
Post by Ivan Maidanski
(1) a variant of GC_register_disappearing_link that also
incremented an `unsigned long' somewhere in memory
You mean statistic counter, right?
Good idea.
Yes, but ideally one that can be associated with a particular memory
location. So, collecting a weak reference in collection A would
increment a counter associated with A, but not affect a counter
associated with collection B.
I'm not sure I understand all the trade-offs here. But isn't this doable with (possibly chained) finalizers already? I don't have a very strong feeling either way, but this doesn't feel like a particularly clean interface addition to me.
Post by Andy Wingo
Post by Ivan Maidanski
// assume GC_general_register_disappearing_link(&link, obj1); ...
GC_unregister_disappearing_link(&link);
GC_general_register_disappearing_link(&link, obj2);
Right? If yes, I don't mind again.
I'm OK with that one, too. It seems a bit esoteric, but it has a clean definition, and there seems to be a clear performance gain. It would be nice if it didn't get pulled in for embedded clients that don't use it and care about code size.
Post by Andy Wingo
OK, great. (Embarrassingly, late yesterday I realized that part of my
overhead was caused by terrible flaws in my hash functions, causing
extra collisions and thus reshuffling. It's not often that you get to
replace a hash function and speed up your program by 25%!)
Post by Ivan Maidanski
(3) It would be nice if there were a callback after GC. It would let
me accurately measure the time in GC, for example. This isn't
relevant to the current discussion though.
What's about existing GC_set_start_callback?
That callback is tricky, because you can't do much in it. But it is
good for starting a timer. Then you need a GC_set_stop_callback to
stop the timer and record the time spent in GC.
Another thing a GC_set_stop_callback would be useful for would be
statistical profiling of the causes of GC -- just sample the call
stack in the stop callback. Such a sampler is likely to allocate
memory though and so it's not appropriate for a start callback.
I'm OK with this IF you can reasonably define what it means. What does it mean with incremental/generational GC? Is it invoked only at the end of full collections? Can you get two concurrent invocations for different GCs because the first one didn't finish before the second one started? If not, how do you prevent it?
Hans
Post by Andy Wingo
Regards,
Andy
--
http://wingolog.org/
Ivan Maidanski
2011-11-09 10:56:23 UTC
Permalink
Hi Hans and Andy,
Post by Ivan Maidanski
Hi Hans,
Regarding GC_move_disappearing_link: I pushed the code to master yesterday (finalize.c).
To care about app code size, I think it's better to move all disappearing-link-related functions to a standalone .c file.
Sorry, I was wrong about this (the disappearing-link-related code is invoked from other parts of the GC, and moving only its API functions to a standalone file would require making some global vars non-static).

So, I've added a new macro, GC_MOVE_DISAPPEARING_LINK_NOT_NEEDED (similar to JAVA_FINALIZATION_NOT_NEEDED), to exclude compilation of that function on demand.

Regards.
Post by Ivan Maidanski
BTW: GC_move_disappearing_link(&link, &link) can be used to test whether a link is registered.
Regards.
Post by Boehm, Hans
Sorry about the belated reply. Again.
-----Original Message-----
Sent: Wednesday, October 26, 2011 2:36 AM
To: Boehm, Hans; Andy Wingo
Cc: gc
Subject: Re[4]: [Gc] weak maps and libgc
Hi Hans,
What's your opinion of these 3 ideas?
PS. Anyway, it won't go into 7.2 branch.
Regards.
Post by Andy Wingo
Thanks for the thoughts, Ivan!
Post by Ivan Maidanski
(1) a variant of GC_register_disappearing_link that also
incremented an `unsigned long' somewhere in memory
You mean statistic counter, right?
Good idea.
Yes, but ideally one that can be associated with a particular memory
location. So, collecting a weak reference in collection A would
increment a counter associated with A, but not affect a counter
associated with collection B.
I'm not sure I understand all the trade-offs here. But isn't this doable with (possibly chained) finalizers already? I don't have a very strong feeling either way, but this doesn't feel like a particularly clean interface addition to me.
Post by Andy Wingo
Post by Ivan Maidanski
// assume GC_general_register_disappearing_link(&link, obj1); ...
GC_unregister_disappearing_link(&link);
GC_general_register_disappearing_link(&link, obj2);
Right? If yes, I don't mind again.
I'm OK with that one, too. It seems a bit esoteric, but it has a clean definition, and there seems to be a clear performance gain. It would be nice if it didn't get pulled in for embedded clients that don't use it and care about code size.
Post by Andy Wingo
OK, great. (Embarrassingly, late yesterday I realized that part of my
overhead was caused by terrible flaws in my hash functions, causing
extra collisions and thus reshuffling. It's not often that you get to
replace a hash function and speed up your program by 25%!)
Post by Ivan Maidanski
(3) It would be nice if there were a callback after GC. It would let
me accurately measure the time in GC, for example. This isn't
relevant to the current discussion though.
What's about existing GC_set_start_callback?
That callback is tricky, because you can't do much in it. But it is
good for starting a timer. Then you need a GC_set_stop_callback to
stop the timer and record the time spent in GC.
Another thing a GC_set_stop_callback would be useful for would be
statistical profiling of the causes of GC -- just sample the call
stack in the stop callback. Such a sampler is likely to allocate
memory though and so it's not appropriate for a start callback.
I'm OK with this IF you can reasonably define what it means. What does it mean with incremental/generational GC? Is it invoked only at the end of full collections? Can you get two concurrent invocations for different GCs because the first one didn't finish before the second one started? If not, how do you prevent it?
Hans
Post by Andy Wingo
Regards,
Andy
--
http://wingolog.org/
Andy Wingo
2011-10-26 16:18:09 UTC
Permalink
Post by Ivan Maidanski
(2) a function to move disappearing links, perhaps callable only
within the alloc lock.
I did some profiling today with the open-addressed table. It has
to shuffle items around sometimes. Unregistering then registering
disappearing links shows up pretty high on the profile.
// assume GC_general_register_disappearing_link(&link, obj1);
...
GC_unregister_disappearing_link(&link);
GC_general_register_disappearing_link(&link, obj2);
Right? If yes, I don't mind again.
Patch attached. WDYT?

Andy
Ivan Maidanski
2011-11-04 12:58:04 UTC
Permalink
Hi Andy,

Looks good to me. I've checked it into my working tree (and will commit it in several days, I hope).
I've made one change to the proposed function: in the case dest == src, it now returns success (instead of GC_DUPLICATE).

Also, I've added some tests regarding the function.

Regards.
Post by Andy Wingo
Post by Ivan Maidanski
(2) a function to move disappearing links, perhaps callable only
within the alloc lock.
I did some profiling today with the open-addressed table. It has
to shuffle items around sometimes. Unregistering then registering
disappearing links shows up pretty high on the profile.
// assume GC_general_register_disappearing_link(&link, obj1);
...
GC_unregister_disappearing_link(&link);
GC_general_register_disappearing_link(&link, obj2);
Right? If yes, I don't mind again.
Patch attached. WDYT?
Andy
--
http://wingolog.org/
Ivan Maidanski
2011-10-27 05:18:17 UTC
Permalink
Hi Andy,
Post by Andy Wingo
Hi Ivan,
Thanks for the response!
Post by Ivan Maidanski
Could you please summarize your past writings about the
implementation difficulties? Thanks.
Last weekend I rewrote Guile's weak maps and weak sets, and so these
things are fresh in my mind.
1. You can't reliably use the "buckets and chains" strategy to
implement weak hash tables with libgc.
The reason is that with this strategy, creating an entry in the
table allocates memory: typically one spine segment and one rib
segment. Typically you would allocate the rib with a GC kind that
only marks the value if the key has not been nulled out.
However, if the GC collects the key and nulls out one or both parts
of the rib, or even the link from the spine to the rib, you are
still left with at least an extra spine segment.
You can lazily collect spine segments, but that will only happen
when your hash table gets full, relatively speaking. Or you can go
over the whole table and remove dead ribs and their corresponding
spine segments, but this is expensive. However, when would you do
this? The right time to do it is after GC, when disappearing links
are nulled out, but there is no after-GC hook. More on this
later.
To remove an extra spine segment, I've added a finalizer to the key
which unchains the element (the spine segment) that no longer refers to
the key (the field is cleared by the GC before running finalizers).
Post by Andy Wingo
foo = new_weak_table ();
while (1)
weak_table_set (foo, new_key (), new_value());
If you never go over the whole table to remove links, *there is
practically no upper bound on your memory usage* because you will
keep adding entries to the table, GC happens less and less
frequently relative to the insertion rate because of the rising
heap size (due to spines, ribs, and the keys and values that
haven't yet been collected), and the dead mappings still take up
memory (via the spines and ribs). It is actually worse if you have
the ability to resize tables, for some loads.
I haven't proven that result mathematically but I'm convinced that
weak maps implemented as buckets-and-chains hash tables is a bad
idea with libgc. So the new weak maps I used in Guile are
open-addressed linear-probe tables with robin hood collision
resolution. That solves this dead-entries-occupying-memory issue.
2. Keeping an accurate idea of the load factor of the table is
important to know when to resize the table, either up or down.
However if the GC is constantly removing entries, then your idea of
how many elements are in the table is imprecise. In a
multithreaded program there is no way to know how many elements are
actually in your table. You can know the upper bound but not the
number of those elements that have been removed.
This is important in the workload for (1). Say your load has
reached the point at which you would resize up. But what if you
actually don't need to do so, because there are enough nulled out
entries? Unfortunately you can't know that.
Also it's not a good idea to traverse the table looking for nulled
entries before making the resize decision, and not only because
it's expensive to visit each entry. The real reason is that if a
large portion of your keys are really alive, you could get into
tortoise-and-hare races in which reaping the dead entries in the
table does allow you to move forward, but you end up traversing the
whole table more times than you would if you just resized anyway.
The only way to keep an accurate count in a weak table is to have
integration with the GC. If the GC nulls out an element in a
particular memory region, it could decrement a counter. But there
is no support for this in libgc.
3. The proper time to update the count for a weak table is after GC,
as I have said. But there libgc does not have any notifiers that
GC has happened. It has the start_callback, but you can't
generally do anything there; what I need is a stop_callback that
happens outside the alloc lock, in which I can allocate memory.
Currently with newer GC I use the start_callback to tell Guile to
register a procedure to be called "whenever", which will happen
after GC. With older GC I have to play a trick with finalizers;
it's in the attached code, and is terrible.
4. Moving items in an open-addressed table requires moving
disappearing links. There is no way to do this currently in libgc:
you have to unregister one and register another. This allocates
memory.
5. Accessing elements of a weak table is very expensive, as it has to
happen in the alloc lock. If weak tables were part of GC, it could
make it cheaper to access elements, maybe. But I won't push this,
as I also need custom equality predicates. Currently traversing a
set with an occupancy of about 4000 of 7000 slots, and reshuffling
the table as nulled elements are removed, takes about 100
nanoseconds per element. I'm estimating that based on the total of
about 300 or 500 microseconds to reap the table, which is a lot.
As you can imagine there is a fair amount of variance depending on
the number of items that need to be moved.
6. There is no way around the fact that values in a weak-key map are
strong references, and can thus keep their keys alive
indefinitely. This issue is covered in the ephemeron paper I
linked to, and could also be solved by the GC.
I don't understand what you mean in 6.

Regards.
Post by Andy Wingo
I have attached my new weak set implementation, as it shows most of the
issues. Weak maps are the same except the entry has a value as well,
and for singly-weak tables (key-weak or value-weak, but not
doubly-weak), we have to use a special GC kind to get the marking
right. (Doubly weak tables can just use atomic memory regions.)
Thanks for reading. The implementation follows. It's pretty
/* Function that returns nonzero if the given object is the one we are
looking for. */
typedef int (*scm_t_set_predicate_fn) (SCM obj, void *closure);
/* Function to fold over the elements of a set. */
typedef SCM (*scm_t_set_fold_fn) (void *closure, SCM key, SCM result);
Andy
--
http://wingolog.org/
Andy Wingo
2011-10-27 08:08:01 UTC
Permalink
Hi Ivan,
To remove an extra spine segment, I've added a finalizer to the key
which unchains the element (spine segment) no longer referring to key
(the field is cleared by GC before running finalizers).
Is there a way to do that if the user wants to add keys to your weak
table that already have finalizers? I guess you chain the finalizers,
then? OK, thanks; hadn't thought of that.
Post by Andy Wingo
6. There is no way around the fact that values in a weak-key map are
strong references, and can thus keep their keys alive
indefinitely. This issue is covered in the ephemeron paper I
linked to, and could also be solved by the GC.
I don't understand what you mean in 6.
For a problem description, Java's WeakHashMap documentation says this:

Implementation note: The value objects in a WeakHashMap are held by
ordinary strong references. Thus care should be taken to ensure that
value objects do not strongly refer to their own keys, either directly
or indirectly, since that will prevent the keys from being
discarded. Note that a value object may refer indirectly to its key
via the WeakHashMap itself; that is, a value object may strongly refer
to some other key object whose associated value object, in turn,
strongly refers to the key of the first value object. One way to deal
with this is to wrap values themselves within WeakReferences before
inserting, as in: m.put(key, new WeakReference(value)), and then
unwrapping upon each get.

http://download.oracle.com/javase/1.4.2/docs/api/java/util/WeakHashMap.html

For a deeper discussion and a discussion of solutions, see this paper:

http://www.jucs.org/jucs_14_21/eliminating_cycles_in_weak/jucs_14_21_3481_3497_barros.pdf

Cheers,

Andy
--
http://wingolog.org/
Ivan Maidanski
2011-10-31 15:26:13 UTC
Permalink
Hello Andy,
Post by Andy Wingo
Hi Ivan,
To remove an extra spine segment, I've added a finalizer to the key
which unchains the element (spine segment) no longer referring to key
(the field is cleared by GC before running finalizers).
Is there a way to do that if the user wants to add keys to your weak
table that already have finalizers? I guess you chain the finalizers,
then? OK, thanks; hadn't thought of that.
To put it in terms of C code: the client can register a finalizer only at object creation, and only via my API; the latter registers my finalizer, which just calls the client one. If, later, the client wants to create some sort of weak reference which needs an action when the object is no longer strongly reachable, then I re-register my finalizer with a more complex one (this is just an optimization, as such a case looks rare to me), also passing some collector-visible data (the client_data parameter).
So, I use 2 BDWGC-specific features:
- ability to re-register (and manually unregister) a finalizer;
- ability to pass some client_data pointer (visible to GC).
Post by Andy Wingo
Post by Andy Wingo
6. There is no way around the fact that values in a weak-key map are
strong references, and can thus keep their keys alive
indefinitely. This issue is covered in the ephemeron paper I
linked to, and could also be solved by the GC.
I don't understand what you mean in 6.
Implementation note: The value objects in a WeakHashMap are held by
ordinary strong references. Thus care should be taken to ensure that
value objects do not strongly refer to their own keys, either directly
or indirectly, since that will prevent the keys from being
discarded. Note that a value object may refer indirectly to its key
via the WeakHashMap itself; that is, a value object may strongly refer
to some other key object whose associated value object, in turn,
strongly refers to the key of the first value object. One way to deal
with this is to wrap values themselves within WeakReferences before
inserting, as in: m.put(key, new WeakReference(value)), and then
unwrapping upon each get.
It does not help me understand where the problem is: WeakHashMap is designed to hold strong refs to values, so that is not a problem.

Regards.
Post by Andy Wingo
http://download.oracle.com/javase/1.4.2/docs/api/java/util/WeakHashMap.html
http://www.jucs.org/jucs_14_21/eliminating_cycles_in_weak/jucs_14_21_3481_3497_barros.pdf
Cheers,
Andy
Andy Wingo
2012-02-23 21:15:45 UTC
Permalink
Hello!

Picking up an old thread, I was working on Guile's weak maps recently.
In the development branch, they are implemented using disappearing
links, and periodic traversal to flush out old entries.

Ivan suggested using finalizers to remove old entries, which is a good
idea. I tried replacing disappearing links with finalizers, but you can
guess the outcome: with custom hash/equality predicates, it would be
possible for lookup to race against removal from finalizers.

So, disappearing links + finalizers seems to be the thing. However, we
have a problem: guardians.

When an object is placed in a guardian, Guile chains a finalizer that
resuscitates the object. (We are using Java finalization.) Then user
code can periodically call the guardian to see if there are collectable
objects. If there are, the guardian returns them, and user code can do
what it wants with the object -- probably cleanup.

The problem comes in when user code needs to associate information with
the object, but the object doesn't have a field for it. In that case,
user code probably uses a weak map. The associations in Guile's weak
maps are documented as persisting past the resuscitation of an object by
a guardian. However, disappearing links are cleared before finalizers
run, so in practice we're losing that information.

It seems that these two things are in conflict: weak map entries
persisting past finalization (but being removed after collection
ultimately happens), versus the desire to see entries disappear from
weak maps "atomically" -- that is, disappearing links are synchronized
with the collector, and the finalizer doesn't affect the semantics of
the map. Whereas if there isn't a disappearing link, there is that
lookup / removal race I mentioned.

I'm not sure what to do. In the end, I think I'm going to have to
expose this tradeoff to my users, and make them choose. But I wish I
didn't have to do that.

Does anyone have ideas about how you could make a weak map whose entries
are present until the object is actually collected?

Regards,

Andy
--
http://wingolog.org/
Hans Aberg
2012-02-23 23:54:44 UTC
Permalink
Post by Andy Wingo
Does anyone have ideas about how you could make a weak map whose entries
are present until the object is actually collected?
Might it be related to the two-heap problem (collected & uncollected) discussed before here on this list?

Hans
Ivan Maidanski
2012-02-24 16:25:32 UTC
Permalink
Hi Andy,

My idea is to use a hidden pointer to the object as the hash-map key, and to register a finalizer (probably chained with the client finalizer) for the object, which deletes the entry in the map.

PS. I wonder whether it's possible in Java? (PhantomReference keeps the reference to object while it is alive but you can't get that reference.)
Post by Hans Aberg
Post by Andy Wingo
Does anyone have ideas about how you could make a weak map whose entries
are present until the object is actually collected?
Might it be related to the two-heap problem (collected & uncollected) discussed before here on this list?
Hans
_______________________________________________
Gc mailing list
http://www.hpl.hp.com/hosted/linux/mail-archives/gc/
Andy Wingo
2012-02-24 17:24:17 UTC
Permalink
Hi!
Post by Ivan Maidanski
Post by Andy Wingo
Does anyone have ideas about how you could make a weak map whose entries
are present until the object is actually collected?
My idea is to use a hidden pointer to the object as the hash-map key, and
to register a finalizer (probably chained with the client finalizer) for
the object, which deletes the entry in the map.
This works if you are looking up entries by the identity of the key --
if the key becomes finalizable, after there are no guardian finalizers
at the head of the finalizer chain, then that entry can never be found.

However, if you can access entries by some aspect of the key, that
doesn't work.

Let me give an example. Guile has a "symbol" data type: basically an
interned string. If two symbols have the same characters, they should
be eq?.

For various reasons, Guile can garbage-collect symbols. So consider:

1. You read the expression (lambda () (define (foo) bar) (foo)).

2. You compile it. There is no `foo' symbol in the compiled
procedure. `foo' is logically collectable now.

3. The GC decides `foo' is collectable and queues a finalizer to
remove it from the symbol table.

4. You read another expression: (lambda () (define (foo) bar) (foo)).

5. In between reading the first `foo' and the second `foo', the
finalizer runs.

6. Now you have two `foo' symbols, with the same characters but
different identities. Oops!

This problem doesn't occur with disappearing links, because they are
synchronized with the collector. Finalizers are not synchronized, so
they can race with the mutator.
Post by Ivan Maidanski
PS. I wonder whether it's possible in Java? (PhantomReference keeps the
reference to object while it is alive but you can't get that reference.)
I don't know :) After having spent an hour or so now looking at this,
I'm not clear on what you would do here. PhantomReference + reference
queues seems to be like guardians, but you can't resuscitate the object.
It seems to me that the entries in an associated weak map would remove
themselves either too early (via a dropped weakref) or too late (via an
asynchronous poll on the refqueue).

Food for thought!

Andy
--
http://wingolog.org/
Petter Urkedal
2011-10-25 17:18:16 UTC
Permalink
Post by Andy Wingo
Hello,
I and a few others have written in the past about the difficulties of
implementing weak maps with libgc. Properly implementing weak maps
requires GC integration. Currently I do terrible things with the pre-GC
hook marking a flag that causes a vacuum procedure to run after GC. I
had not thought about the problem of cycles from values to keys in
http://www.jucs.org/jucs_14_21/eliminating_cycles_in_weak/jucs_14_21_3481_3497_barros.pdf
To implement this sort of thing, you need tighter GC integration. But
providing those hooks limits the GC. To avoid publically exposing those
hooks, it would be possible to incorporate a weak map or weak set
implementation into libgc itself. What do you think?
Ivan recently accepted a patch I've been using for some time to
implement hash-consing efficiently. The API is found in "gc_disclaim.h". I
intended it to be general enough to implement other weak data structures, as
well.

The idea is to register a call-back which will be invoked by the
collector before it reclaims an object, allowing it to remove the object
from any data structures where it occurs weakly. The call-back may also tell
the collector not to reclaim the object. This is convenient to avoid
deadlocks (see below), or to inform the collector that new strong
references may have been created since the last collection.

Here is a sketch of how to use the API:

Create a new object kind and call GC_register_disclaim_proc with the
call-back. You may want to pass 1 as the last argument (mark_from_all)
to protect any pointers contained in the weak key:

void weakcoll_init(void)
{
    weakcoll_kind = GC_new_kind(...);
    GC_register_disclaim_proc(weakcoll_kind, weakcoll_disclaim_proc, NULL, 1);
}

You will typically use the first word of the key to store a reference to
the container from which it shall be removed. Due to a technical issue,
the disclaim call-back will currently also be called on freed objects,
in which case a free-link occurs in the first word. The trick is to use
the lower bit of the first word to indicate that the object is live:

int weakcoll_disclaim_proc(void *key, void *cd)
{
    if (!(*(uintptr_t *)key & 1))
        return 0; /* Skip object in free-list. */

    weakcoll_t c = (weakcoll_t)(*(uintptr_t *)key & ~(uintptr_t)1);
    if (!weakcoll_try_lock(c)) /* Avoid dead-lock! */
        return 1; /* Retry on next cycle. */

    weakcoll_locked_remove(c, key);
    weakcoll_unlock(c);
    return 0; /* Successfully released. */
}

For a complete example, I have a hash-consing implementation in
https://github.com/paurkedal/culibs/blob/master/cuoo/halloc_disclaim.c
though it's not very pedagogical, as it's cluttered with optimisations, and
somewhat specialized.
Andy Wingo
2011-10-25 19:56:45 UTC
Permalink
Hi Petter,

Thanks for the note!
Post by Petter Urkedal
Ivan recently accepted a patch I've been using for some time to
implement hash-consing efficiently. The API is found in "gc_disclaim.h". I
intended it to be general enough to implement other weak data structures, as
well.
Do you have to allocate the keys in your table to be of a certain kind?
In my case that is not possible, if I understand you correctly, as the
objects might already have another kind of their own.

Andy
--
http://wingolog.org/
Petter Urkedal
2011-10-25 20:28:47 UTC
Permalink
Post by Andy Wingo
Hi Petter,
Thanks for the note!
Post by Petter Urkedal
Ivan recently accepted a patch I've been using for some time to
implement hash-consing efficiently. The API is found in "gc_disclaim.h". I
intended it to be general enough to implement other weak data structures, as
well.
Do you have to allocate the keys in your table to be of a certain kind?
In my case that is not possible, if I understand you correctly, as the
objects might already have another kind of their own.
Yes, or at least you need to store a pointer from your keys to an object
of that kind, and probably back again in order to delete the entry. You
could call GC_register_disclaim_proc on your own kind, but that may not
be feasible performance-wise if it's a generally used kind. That is,
unless you also happen to need finalizers on a large fraction of
objects, in which case the disclaim call-back can be used to call the
finalizers, as well.

I haven't benchmarked precisely, but: GC_register_disclaim_proc should
have a slight performance impact on objects ignored by the call-back with
mark-from-all disabled, while I would expect a more significant impact
with mark-from-all enabled.