guile-devel

From: Maxime Devos
Subject: Re: Request to add *-resize! functions for contiguous mutable data structures.
Date: Sat, 07 Aug 2021 13:09:42 +0200

Vijay Marupudi wrote on Fri 06-08-2021 at 09:33 [-0500]:
> Hello!
> 
> I was curious if Guile would be willing to provide a series of
> new procedures for resizing contiguous memory regions.
> 
> (bytevector-resize! <bytevector> new-size [fill])
> (vector-resize! <vector> new-size [fill])
> 
> The [fill] parameter could be used if the new-size is bigger than
> the current size.
>
> This would make writing imperative code easier and more
> performant.

A problem is that this prevents optimisations and can currently
introduce bugs in concurrent code.  Consider the following code:

b.scm:
(use-modules (rnrs bytevectors))

(define (bv-first-two bv)
  (unless (bytevector? bv)
    (error "not a bv"))
  (unless (>= (bytevector-length bv) 2) ; L6
    (error "too small"))
  (values (bytevector-u8-ref bv 0)   ; L8
          (bytevector-u8-ref bv 1))) ; L9
bv-first-two


Compile it with optimisations enabled:

  guild compile b.scm -o b.go -O3 && guild disassemble b.go

(Unfortunately, Guile cannot yet compile away the bounds checks at L8 and L9,
even though we already performed a bounds check at L6.)

I can't say I understand the disassembled code very well, but I do note
that the bounds checks (search for (jl ...), (jnl ...) and imm-u64<?,
s64-imm<? and u64<?) are separate from the read of the 'length' of the
bytevector (maybe 'word-ref/immediate' or 'pointer-ref/immediate') and
from the reads of the first and second byte of the bytevector (maybe
(u8-ref 5 3 0) and (u8-ref 2 3 2)).

Now suppose some concurrent thread resizes the bytevector between the bounds
check and the actual read. Then there will be an out-of-bounds access ...
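
To make the check-then-read window concrete, here is a minimal sketch that
simulates the interleaving deterministically, without real threads.  Everything
named <buffer>, buffer-resize! and buffer-first-two is hypothetical and not part
of Guile: the "resizable" buffer is just a mutable box around an ordinary
fixed-size bytevector, and the 'interleaved' thunk stands in for whatever another
thread might do between the length check and the reads.

```scheme
(use-modules (rnrs bytevectors)
             (srfi srfi-9))

;; Hypothetical resizable buffer: a mutable box around a fixed-size bytevector.
(define-record-type <buffer>
  (make-buffer bv)
  buffer?
  (bv buffer-bv set-buffer-bv!))

;; Hypothetical resize: copy into a fresh bytevector and swap it into the box.
(define (buffer-resize! buf new-size)
  (let* ((old (buffer-bv buf))
         (new (make-bytevector new-size 0)))
    (bytevector-copy! old 0 new 0 (min new-size (bytevector-length old)))
    (set-buffer-bv! buf new)))

;; Same shape as bv-first-two: check the length, then read two bytes.
;; INTERLEAVED runs between the check and the reads, as another thread might.
(define (buffer-first-two buf interleaved)
  (unless (>= (bytevector-length (buffer-bv buf)) 2)
    (error "too small"))
  (interleaved)
  (values (bytevector-u8-ref (buffer-bv buf) 0)
          (bytevector-u8-ref (buffer-bv buf) 1)))
```

With (define buf (make-buffer (u8-list->bytevector '(10 20 30)))), calling
(buffer-first-two buf (lambda () #t)) passes the check and reads both bytes,
whereas (buffer-first-two buf (lambda () (buffer-resize! buf 1))) passes the
check and then fails on the second read.  In this sketch the failure is only an
exception, because each bytevector-u8-ref still does its own check against a
fixed-size bytevector; with a true in-place resize and a bounds check that was
done earlier or optimised away, the same interleaving would read memory out of
bounds.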

> I acknowledge that it is not idiomatic Scheme to use
> mutable data structures, however this is useful to me for
> dealing with large amounts of text data, in which I need random
> access and flexible data storage. It would allow me to move off
> my custom C extension vector and allow me to use other
> vector-* functions.
> 
> Ideally, this would use libc's `realloc` to make the resize
> quick, so that it can avoid data copying whenever possible.

If you're very careful, you can use 'bytevector->pointer', 'pointer->bytevector'
and (foreign-library-function ... "malloc" ...),
(foreign-library-function ... "realloc" ...),
(foreign-library-function ... "free" ...).
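
As a rough sketch of what that could look like (not a vetted implementation):
the helper names malloc-bytevector, resize-malloc-bytevector and
free-malloc-bytevector are made up for illustration, and it assumes that passing
#f as the library makes foreign-library-function look the symbols up in the
running process, where libc is already loaded.  Sizes are assumed to be positive.

```scheme
(use-modules (system foreign)
             (system foreign-library)
             (rnrs bytevectors))

;; Assumption: #f means "look the symbol up in the current process" (libc).
(define malloc
  (foreign-library-function #f "malloc"
                            #:return-type '*
                            #:arg-types (list size_t)))
(define realloc
  (foreign-library-function #f "realloc"
                            #:return-type '*
                            #:arg-types (list '* size_t)))
(define free
  (foreign-library-function #f "free"
                            #:return-type void
                            #:arg-types (list '*)))

;; Hypothetical helper: a bytevector backed by malloc'ed memory (SIZE > 0).
;; The contents are uninitialised, and the garbage collector will not free them.
(define (malloc-bytevector size)
  (let ((ptr (malloc size)))
    (when (null-pointer? ptr)
      (error "malloc failed"))
    (pointer->bytevector ptr size)))

;; Hypothetical helper: returns a NEW bytevector over the realloc'ed memory.
;; The old bytevector may now point at freed memory and must never be used again.
(define (resize-malloc-bytevector bv new-size)
  (let ((ptr (realloc (bytevector->pointer bv) new-size)))
    (when (null-pointer? ptr)
      (error "realloc failed"))
    (pointer->bytevector ptr new-size)))

;; Hypothetical helper: release the memory; BV must not be used afterwards.
(define (free-malloc-bytevector bv)
  (free (bytevector->pointer bv)))
```

The "very careful" part is that nothing here protects against the problem
described above: any code (or thread) still holding the old bytevector after a
resize or free has a dangling reference, and only bytevectors that came from
malloc-bytevector may be passed to the other two helpers.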

(ice-9 vlist) and <https://github.com/ijp/fectors> might be interesting as well.

Greetings,
Maxime.
