David Kastrup
2015-11-08 11:28:14 UTC
The Scheme interpreter Guile has been using the Boehm GC since version
2.0 and that's part of the reason the music typesetter LilyPond has not
yet been ported to Guile 2.
Guile offers user-defined types and in particular allows registering a
deallocation procedure for them via
-- C Function: void scm_set_smob_free (scm_t_bits tc, size_t (*free)
(SCM obj))
This function sets the smob freeing procedure (sometimes referred
to as a "finalizer") for the smob type specified by the tag TC. TC
is the tag returned by ‘scm_make_smob_type’.
The FREE procedure must deallocate all resources that are directly
associated with the smob instance OBJ. It must assume that all
‘SCM’ values that it references have already been freed and are
thus invalid.
It must also not call any libguile function or macro except
‘scm_gc_free’, ‘SCM_SMOB_FLAGS’, ‘SCM_SMOB_DATA’,
‘SCM_SMOB_DATA_2’, and ‘SCM_SMOB_DATA_3’.
Note in particular the second paragraph, which clearly states that there
are no guarantees with respect to finalization order. Now LilyPond makes
heavy use of STL containers holding pointers to other C++ structures
that also have a Scheme presence. It keeps these structures from being
collected by using a mark procedure registered via
-- C Function: void scm_set_smob_mark (scm_t_bits tc, SCM (*mark) (SCM
obj))
This function sets the smob marking procedure for the smob type
specified by the tag TC. TC is the tag returned by
‘scm_make_smob_type’.
Defining a marking procedure may sometimes be unnecessary because
large parts of the process’ memory (with the exception of
‘scm_gc_malloc_pointerless’ regions, and ‘malloc’- or
‘scm_malloc’-allocated memory) are scanned for live pointers(1).
The MARK procedure must cause ‘scm_gc_mark’ to be called for every
‘SCM’ value that is directly referenced by the smob instance OBJ.
One of these ‘SCM’ values can be returned from the procedure and
Guile will call ‘scm_gc_mark’ for it. This can be used to avoid
deep recursions for smob instances that form a list.
It must not call any libguile function or macro except
‘scm_gc_mark’, ‘SCM_SMOB_FLAGS’, ‘SCM_SMOB_DATA’,
‘SCM_SMOB_DATA_2’, and ‘SCM_SMOB_DATA_3’.
Now the problem we encounter is this: if some structure A points to B,
some structure B points to A, and both A and B are placed into
finalization, then the finalization of A will delete the C++ structure
associated with A. If a mark pass is then allowed to run through B, it
will try to access the already deleted C++ structure of A.
Topological ordering will not do the trick since cyclical references are
quite typical (a NoteHead has a reference to the corresponding Stem, a
Stem has references to all corresponding NoteHeads).
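To illustrate why ordering alone cannot help, here is a minimal
standalone sketch (not LilyPond or Guile code; the function name and the
integer object ids are made up for illustration). It tries to compute an
order in which every object is finalized before the objects it
references, and gives up as soon as the reference graph contains a
cycle:

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical helper: objects are numbered 0..n-1 and each pair
// (from, to) means "object `from` holds a reference to object `to`".
// Returns an order in which every object comes before the objects it
// references, or an empty vector when a reference cycle makes such an
// order impossible (Kahn's algorithm).
std::vector<std::size_t> finalization_order(
    std::size_t n,
    const std::vector<std::pair<std::size_t, std::size_t>>& refs) {
    std::vector<std::vector<std::size_t>> out(n);
    std::vector<std::size_t> incoming(n, 0);
    for (const auto& r : refs) {
        assert(r.first < n && r.second < n);
        out[r.first].push_back(r.second);
        ++incoming[r.second];
    }

    // Objects nobody references can be finalized first.
    std::queue<std::size_t> ready;
    for (std::size_t i = 0; i < n; ++i)
        if (incoming[i] == 0) ready.push(i);

    std::vector<std::size_t> order;
    while (!ready.empty()) {
        std::size_t v = ready.front();
        ready.pop();
        order.push_back(v);
        for (std::size_t w : out[v])
            if (--incoming[w] == 0) ready.push(w);
    }

    // Members of a cycle never reach zero incoming references.
    if (order.size() != n) order.clear();
    return order;
}
```

With object 0 standing for a NoteHead and object 1 for its Stem, the
references {0 -> 1, 1 -> 0} yield an empty result: there is no order in
which one of the two can safely go first.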
So what we need here is a setting/hook where queuing some group of
objects for finalization will _stop_ any marking on them (including
false positives of the conservative marking process) before the first
finalization through a type hook stored via scm_set_smob_free on any
member of that group of objects occurs.
Individual objects don't have hooks of their own but they have some
additional bits available where one could record "placed into
finalization queue, don't call its scm_set_smob_mark hook any more"
information.
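The semantics I am after could be sketched roughly as follows. This is a
standalone simulation under stated assumptions, not code against
libguile or the Boehm GC: Cell, Payload, the two flag bits and all
function names are hypothetical stand-ins for a smob cell, its C++
structure, the spare bits mentioned above, and the mark/free hooks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Payload;

// Hypothetical stand-in for a smob cell: some spare flag bits plus a
// pointer to the C++ structure owned by the cell.
struct Cell {
    unsigned flags = 0;
    Payload* data = nullptr;
};

// Hypothetical C++ structure holding references to other cells
// (e.g. a Stem pointing at its NoteHeads).
struct Payload {
    std::vector<Cell*> refs;
};

constexpr unsigned kQueuedForFinalization = 1u << 0;
constexpr unsigned kMarked = 1u << 1;

// Step 1: flag every member of the doomed group before the first free
// hook of any member runs.
void queue_for_finalization(const std::vector<Cell*>& group) {
    for (Cell* c : group) c->flags |= kQueuedForFinalization;
}

// Mark hook: returns the number of cells newly marked. A flagged cell is
// skipped without looking at its payload, which may already be gone.
std::size_t mark(Cell* c) {
    if (c->flags & (kQueuedForFinalization | kMarked)) return 0;
    c->flags |= kMarked;
    std::size_t visited = 1;
    for (Cell* r : c->data->refs) visited += mark(r);
    return visited;
}

// Free hook: deletes only the cell's own C++ structure and never follows
// references to other cells.
void finalize(Cell* c) {
    assert(c->flags & kQueuedForFinalization);
    delete c->data;
    c->data = nullptr;
}
```

With A and B referencing each other, calling queue_for_finalization on
both, then finalize on A, then mark on B touches nothing that has been
deleted. What this sketch cannot answer is the actual question: who
calls the equivalent of queue_for_finalization for the whole group
before the first free hook runs.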
The problem is how to get this information, or how to get these
semantics from Boehm GC. Any ideas?
--
David Kastrup