bug-guile

bug#14599: An option to make vector allocation aligned


From: Jan Schukat
Subject: bug#14599: An option to make vector allocation aligned
Date: Wed, 12 Jun 2013 23:14:31 +0200
User-agent: Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/20130510 Thunderbird/17.0.6

Thought a bit about it, and it would really be nice to have an aligned uniform vector API.

At the moment all of them are 8-byte aligned, so you would probably also want to be able to request at least 16- and 32-byte alignment (Intel's AVX has 256-bit registers that work better with aligned data). Even 64 bytes and more could be useful for cache line alignment, although that would have to be a separate alignment setting, because the benefits of cache line alignment are largely defeated if the header sits in a different cache line.

So I guess just one alignment, namely that of the first element, is feasible without wasting whole cache lines. If you really need cache line alignment you can still use the take_*vector functions, and it's pretty rare to do such things anyway. But being able to control the alignment of the first element allows you to properly use SIMD instructions on those vectors.

You don't even really need any more space to store alignment information, since that can be directly inferred from the bytevector content pointer, although the bytevector flags still have more than enough space to store it.
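To illustrate the point about inferring alignment from the content pointer, here is a minimal C sketch. The helper names `inferred_alignment` and `is_aligned` are hypothetical and not part of libguile; the sketch only assumes that a pointer converts to `uintptr_t` as a flat address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Largest power of two that divides the address of P.
   This isolates the lowest set bit of the address, so P must not be NULL.
   Hypothetical helper, not a libguile function.  */
static size_t
inferred_alignment (const void *p)
{
  uintptr_t a = (uintptr_t) p;
  assert (a != 0);
  return (size_t) (a & (~a + 1));
}

/* Non-zero if P is a multiple of ALIGNMENT, which must be a power of two.  */
static int
is_aligned (const void *p, size_t alignment)
{
  assert (alignment != 0 && (alignment & (alignment - 1)) == 0);
  return ((uintptr_t) p & (alignment - 1)) == 0;
}
```

A pointer can of course turn out to be more strictly aligned than was requested, so the value inferred this way says what the pointer happens to satisfy, not what the caller asked for.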

Extending the programming API to support this is a bit more tricky. I guess the most straightforward and backward-compatible way would be to just add a set of make-aligned-*vector, aligned-*vector and *->aligned-*vector functions, and their scm_* versions, with an additional alignment parameter. Optional alignment parameters on the old functions could be nice too, but I guess that is just asking for compatibility trouble.

The other question is the read syntax (one of the primary reasons I'm doing all this). If alignment is something that should be preserved in the permanent representation, you also need to store it in the flags, since the content pointer can be aligned by coincidence. I haven't looked at the compiling of bytevectors yet, to see if alignment can be handled easily there.

As for the text representation, I think the simplest way is to add another reserved character followed by the alignment number, working for both uniform vectors and arrays, like #vu8>8(1 2 3 4 5 6) to have the first element at 8-byte alignment. Right now the allocation pretty much ensures 4-byte alignment of the first element on 32-bit machines and 8-byte alignment on 64-bit machines, because gc_malloc returns 8-byte aligned blocks, but the array starts at cell word 3. Any vector of a 64-bit type like double or long is therefore already guaranteed to be misaligned on 32-bit platforms. That is even more unfortunate on Linux x32 ABI systems, which use efficient 64-bit ints with 32-bit pointers, while the cell size is determined by the pointer size.
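The arithmetic behind those numbers can be sketched in C. The sketch reads "cell word 3" as a content offset of three pointer-sized words from an 8-byte aligned block; that reading, and the helper name `guaranteed_alignment`, are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Alignment that is guaranteed for (base + header_bytes) when base is a
   multiple of BLOCK_ALIGN (a power of two): the largest power of two that
   divides both BLOCK_ALIGN and HEADER_BYTES.
   Hypothetical helper, not a libguile function.  */
static size_t
guaranteed_alignment (size_t block_align, size_t header_bytes)
{
  size_t bits = block_align | header_bytes;
  assert (block_align != 0 && (block_align & (block_align - 1)) == 0);
  return bits & (~bits + 1);   /* lowest set bit */
}
```

With 4-byte words the offset is 12 bytes, so `guaranteed_alignment (8, 12)` is 4; with 8-byte words the offset is 24 bytes, so `guaranteed_alignment (8, 24)` is 8. On x32 the words are 4 bytes, so 64-bit elements end up on a 4-byte boundary.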

Or, to construct arrays of 4-element SIMD vectors, #2f32:2:4>16((1 2 3 4)(1 2 3 4)). Maybe even have a default alignment of 16 when you just use > without a number, so #2f32:2:4>((1 2 3 4)(1 2 3 4)) is the same thing. Or, even more convenient, #m128((1 2 3 4)(1.0 1.0 1.0 1.0) (2.0 2.0)), where you can freely mix the underlying types and the size of the elements is inferred from how many of them there are in each group.


So if there is interest in something like this for mainline Guile, I will make the patches. If not, I'll just stick to my crude hack for now and see if I need the full shebang :).


Regards

Jan Schukat


On 06/12/2013 04:59 PM, Ludovic Courtès wrote:
severity 14599 wishlist
thanks

Hi!

Jan Schukat <address@hidden> wrote:

If you want to access native uniform vectors from c, sometimes you
really want guarantees about the alignment.
[...]

This isn't necessarily true for vectors created from pre-existing
buffers (the take_*vector functions), but there you have control over
the pointer you pass, so you can make it true if needed.

So if there is interest, maybe this could be integrated into the build
system as a configuration like this:


--- libguile/bytevectors.c    2013-04-11 02:16:30.000000000 +0200
+++ bytevectors.c    2013-06-12 14:45:16.000000000 +0200
@@ -223,10 +223,18 @@

        c_len = len * (scm_i_array_element_type_sizes[element_type] / 8);

+#ifdef SCM_VECTOR_ALIGN
+      contents = scm_gc_malloc_pointerless (SCM_BYTEVECTOR_HEADER_BYTES + c_len + SCM_VECTOR_ALIGN,
+                        SCM_GC_BYTEVECTOR);
+      ret = PTR2SCM (contents);
+      contents += SCM_BYTEVECTOR_HEADER_BYTES;
+      contents += (addr + (SCM_VECTOR_ALIGN - 1)) & -SCM_VECTOR_ALIGN;
+#else
        contents = scm_gc_malloc_pointerless (SCM_BYTEVECTOR_HEADER_BYTES + c_len,
                          SCM_GC_BYTEVECTOR);
        ret = PTR2SCM (contents);
        contents += SCM_BYTEVECTOR_HEADER_BYTES;
+#endif

        SCM_BYTEVECTOR_SET_LENGTH (ret, c_len);
        SCM_BYTEVECTOR_SET_CONTENTS (ret, contents);
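
The hunk above over-allocates by SCM_VECTOR_ALIGN bytes and then moves the content pointer up to the next aligned address; the `addr` variable it uses is not declared in the lines shown. As a standalone sketch of that over-allocate-and-round-up idea, assuming the intent is to round the post-header address up to a power-of-two boundary, the following C uses plain `malloc` in place of `scm_gc_malloc_pointerless` and a hypothetical `HEADER_BYTES` constant in place of `SCM_BYTEVECTOR_HEADER_BYTES`. None of these names come from libguile.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for SCM_BYTEVECTOR_HEADER_BYTES.  */
#define HEADER_BYTES 16

/* Round ADDR up to the next multiple of ALIGNMENT (a power of two).  */
static uintptr_t
round_up (uintptr_t addr, uintptr_t alignment)
{
  assert (alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (addr + (alignment - 1)) & ~(alignment - 1);
}

/* Allocate HEADER_BYTES + LEN + ALIGNMENT bytes and return a pointer to
   LEN usable bytes that starts at a multiple of ALIGNMENT, after the header.
   *BLOCK receives the start of the allocation, which is what must be freed.
   Returns NULL if the allocation fails.  */
static unsigned char *
alloc_aligned_contents (size_t len, size_t alignment, void **block)
{
  unsigned char *base = malloc (HEADER_BYTES + len + alignment);
  *block = base;
  if (base == NULL)
    return NULL;
  /* At most ALIGNMENT - 1 bytes of padding are skipped, so the LEN bytes
     that follow still fit inside the allocation.  */
  return (unsigned char *) round_up ((uintptr_t) (base + HEADER_BYTES),
                                     alignment);
}
```

The padding ends up between the header and the first element, which is why the content pointer, and not the header address, has to be stored in the object.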

I don’t think it should be a compile-time option, because it would be
inflexible and inconvenient.

Instead, I would suggest using the scm_take_ functions if allocating
from C, as you noted.

In Scheme, I came up with the following hack:

--8<---------------cut here---------------start------------->8---
(use-modules (system foreign)
              (rnrs bytevectors)
              (ice-9 match))

(define (memalign len alignment)
   (let* ((b (make-bytevector (+ len alignment)))
          (p (bytevector->pointer b))
          (a (pointer-address p)))
     (match (modulo a alignment)
       (0 b)
       (padding
        (let ((p (make-pointer (+ a (- alignment padding)))))
          ;; XXX: Keep a weak reference to B or it can be collected
          ;; behind our back.
          (pointer->bytevector p len))))))
--8<---------------cut here---------------end--------------->8---

Not particularly elegant, but it does the job.  ;-)

Do you think there’s additional support that should be provided?

Thanks,
Ludo’.





