Re: GC Warning related to large mem block allocation - Help needed
From: Daniel Llorens
Subject: Re: GC Warning related to large mem block allocation - Help needed
Date: Mon, 1 Jan 2018 23:05:37 +0100
On 01 Jan 2018, at 15:11, Freja Nordsiek <address@hidden> wrote:
> The only worry then would be that it would get collected while still
> being used. I think most cases, this would not be a problem. However, if
> someone makes a new bytevector from an existing one from somewhere in
> the middle, it is possible that the new one would only point to the
> middle and not the head and thus could be collected prematurely (would
> need to do some more digging to see if the new one would be allocated
> using make_bytevector_from_buffer). Or, if someone was using C code to
> say take the norm of the vector (very common operation often done with
> BLAS) and the scheme code wasn't going to use the bytevector anymore,
> there might only be a pointer on the stack pointing to the current
> element that the C code is reading and as soon as it gets past the 512
> byte mark, the bytearray might get collected while it is still being
> worked on which would be a disaster. So I am not sure that the
> allocation could be safely changed to use
> GC_MALLOC_ATOMIC_IGNORE_OFF_PAGE if the bytevector is large. I do not
> know enough about Guile internals yet to know if typical pure scheme
> operations would run into problems. I think it is definitely possible
> that there are FFI cases where problems could be run into, which would
> then mean the coder has to take extra precautions to prevent collection,
> which could be a major problem for changing the allocation Guile 2.0.x
> and 2.2.x since it would be a major API change. Wouldn't be such an
> issue for 3.x series since the API could be changed but it would be a
> bit of a surprising result for people to have to worry about if using
> FFI. I could be wrong on this - a pointer to the head might still be
> kept on the stack and then there is no problem.
Hi Freja,
Thanks for these comments. I know too little about GC, but I've dug into the
bytevector / array code now and then. As a user of large arrays, I'm interested
in making them more usable.
The primary means for indirect access to bytevectors / arrays in Guile is the
array API. All array objects contain a reference to their ‘backing store’, and
array handles keep a second copy of this reference, plus a pointer to the head
of the backing store (not to the head of the array itself). The array API uses
a get/release mechanism whose only purpose is to keep those pointers on the
stack. This makes using arrays harder and I've found it annoying, so I've asked
Andy about it a couple of times.
AFAIU, the only functions in the public API that can create a vector/bytevector
object from raw memory are the scm_take_xxx series (which are used internally
by pointer->bytevector). Those functions are clearly meant to be used with
‘foreign’ storage, but if one were to use them with Guile-managed storage, then
I think it's understood that the user is responsible for retaining the relevant
pointers.
That's also how I understand the comments above make_bytevector_from_buffer.
Also AFAIU, the only ways to get a pointer to bytevector storage using the
public API are 1) scm_xxx_elements / scm_xxx_writable_elements, 2) the macro
SCM_BYTEVECTOR_CONTENTS, or 3) the FFI function bytevector->pointer.
The scm_xxx_elements functions take an array handle and enforce the get/release
interface, so they should cause no issues.
I think it would be fair to add a warning to the manual that the user is
responsible for retaining the pointer obtained from SCM_BYTEVECTOR_CONTENTS
around any calls. I think that's what the scm_remember_upto_here functions are
for, although I've never had to use them myself. I've also never used
SCM_BYTEVECTOR_CONTENTS directly. I'm not sure it belongs in the public API to
be honest.
Then bytevector->pointer uses SCM_BYTEVECTOR_CONTENTS internally, but it also
keeps (SCM_BYTEVECTOR_CONTENTS(bv) + offset) in a ‘pointer_weak_refs’ table. So
it seems possible for this to happen:
    y = make_bytevector
    x = bytevector->pointer(y, offset = 512+1)
    do_stuff_with(x)  /* y may be collected while x is still in use */
either in Scheme or in C. But it would be solved if bytevector->pointer kept
just SCM_BYTEVECTOR_CONTENTS(bv) instead of (SCM_BYTEVECTOR_CONTENTS(bv) +
offset) in the table. I'm not sure why it keeps (... + offset). Does this make
sense?
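
To make the argument concrete, here is a toy model in plain C. It is only a sketch and does not use libguile or bdwgc. It assumes, as discussed in this thread, that under the IGNORE_OFF_PAGE variant only pointers into the first 512 bytes of a large block keep it alive, and that bytevector->pointer retains the address as described above. The names HEAD_WINDOW, keeps_alive, retained_current and retained_proposed are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption taken from this thread: only pointers into the first 512
   bytes of a large block keep it alive under the IGNORE_OFF_PAGE variant. */
#define HEAD_WINDOW 512u

/* Toy stand-in for the collector's test (not bdwgc's real logic):
   does address p keep the block [head, head + len) alive? */
static bool keeps_alive(uintptr_t head, size_t len, uintptr_t p)
{
  if (p < head || p - head >= len)
    return false;                 /* not a pointer into this block */
  return p - head < HEAD_WINDOW;  /* pointers past the window are ignored */
}

/* What bytevector->pointer retains today, per the description above:
   the contents address plus the offset. */
static uintptr_t retained_current(uintptr_t contents, size_t offset)
{
  return contents + offset;
}

/* The proposed change: retain the start of the contents instead. */
static uintptr_t retained_proposed(uintptr_t contents, size_t offset)
{
  (void) offset;                  /* the offset no longer affects what is retained */
  return contents;
}
```

In this model, with offset = 512+1 the currently retained address falls outside the head window and would not protect the block, while the proposed one always points at the head.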
So I'd be interested in trying out GC_MALLOC_ATOMIC_IGNORE_OFF_PAGE, if this
fix to bytevector->pointer makes sense and no one can point out another
*concrete* situation where it would result in a GC bug.
Regards
Daniel