Re: more advanced bytevector => supervectors

On 11 Sep 2021, at 19:03, Stefan Israelsson Tampe <stefan.itampe@gmail.com> wrote:

I did some tests, and wingo's superb compiler is about equally fast for a hand-made Scheme loop as for the automatic dispatch of getter and setter. For example, it can
copy from u8 to i16 at about 100 ops/second using native byte order. However, compiling it in C leads to a nasty 2 Gops/second. So for these kinds of patterns
it is still better to work in C, as it probably vectorises the operation quite well. Supervectors support pushing busy loops to C very well, and I will probably
enable fast C code for some simple utility ops.
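
For illustration, the kind of C busy loop meant here could look roughly like the sketch below. The name `copy_u8_to_i16` is hypothetical and not part of the supervector code; whether the loop is actually vectorised depends on the compiler and the optimization flags used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Widen n unsigned bytes into signed 16-bit integers stored in native
   byte order. Hypothetical helper, shown only to illustrate the pattern.
   `restrict` tells the compiler that the buffers do not overlap, which
   makes it easier for an optimizer to vectorise the loop. */
void copy_u8_to_i16(int16_t *restrict dst, const uint8_t *restrict src, size_t n)
{
    assert(n == 0 || (dst != NULL && src != NULL));
    for (size_t i = 0; i < n; i++)
        dst[i] = (int16_t)src[i];   /* every u8 value fits in an i16 */
}
```

A loop of this shape has no dispatch inside the body, which is what makes it a good candidate for being pushed down to C.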

On Wed, Sep 8, 2021 at 9:18 AM lloda <lloda@sarc.name> wrote:

On 8 Sep 2021, at 04:04, Stefan Israelsson Tampe <stefan.itampe@gmail.com> wrote:

...

So using get-setter typically means:
((get-setter #f bin1 #f
   (lambda (set) (set v 2 val)))
 #:is-endian 'little ;; only consider little endian setters like I know
 #:is-unsigned #t    ;; only use unsigned
 #:is-integer #t     ;; only use integer representations
 #:is-fixed #t       ;; do not use the scm value vector versions
 )
So this is a version where we only handle nonnegative integers of up to 64 bits. The gain is faster compilation, as this idiom will dispatch
between 4 different versions of the loop lambda, and the compiler could inline all of them, or detect the ones that are used and hot-compile those versions
(a feature we do not have yet in guile). Now when you dispatch on both a ref and a set you will similarly end up with 4*4 = 16 different loops. The full version
is very large: a double loop with all features consists of (2*2 + 3*2*2*2 + 4 + 1)**2 = 33*33 ~ 1000 versions of the loop, which is crazy if we should expand the loop
for all cases in the compilation. Now guile would just use a functional approach and not expand the loop everywhere. We will have parameterised versions of
libraries so that one can select which versions to compile for. For example, the general functions that transform one supervector to another are a general
idiom that would use the full dispatch, which is not practical.

I'm curious where you're going with this.

I implemented something similar (iiuc) in https://github.com/lloda/guile-newra/, specifically https://github.com/lloda/guile-newra/blob/master/mod/newra/map.scm , where the lookup/set methods are inlined in the loop. The compilation times indeed grow exponentially so I'm forced to have a default 'generic' case.

The idea for fixing this was to have some kind of run time compilation cache so only a fixed number of type combinations that actually get used would be compiled, instead of the tensor product of all types. But I haven't figured out, or actually tried to do that yet.
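
A rough sketch of what such a cache could look like, written in C only to show the shape of the idea. This is not code from guile-newra or Guile: the names `get_kernel`, `specialize` and `specializations_built` are hypothetical, only two element types are modelled, and `specialize` merely picks a prebuilt loop where a real implementation would compile one at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element types; a real library would have many more. */
enum elem_type { T_U8, T_I16, T_COUNT };

typedef void (*kernel_fn)(void *dst, const void *src, size_t n);

/* One specialized copy loop per (source, destination) type pair. */
#define DEFINE_KERNEL(name, S, D)                              \
    static void name(void *dst, const void *src, size_t n)    \
    {                                                          \
        const S *s = src;                                      \
        D *d = dst;                                            \
        for (size_t i = 0; i < n; i++)                         \
            d[i] = (D)s[i];                                    \
    }

DEFINE_KERNEL(k_u8_u8, uint8_t, uint8_t)
DEFINE_KERNEL(k_u8_i16, uint8_t, int16_t)
DEFINE_KERNEL(k_i16_u8, int16_t, uint8_t)
DEFINE_KERNEL(k_i16_i16, int16_t, int16_t)

/* Counts how often the (stand-in) compile step has run. */
int specializations_built = 0;

/* Stand-in for run-time compilation: here it only selects a prebuilt loop. */
static kernel_fn specialize(enum elem_type src, enum elem_type dst)
{
    static const kernel_fn all[T_COUNT][T_COUNT] = {
        [T_U8]  = { [T_U8] = k_u8_u8,  [T_I16] = k_u8_i16 },
        [T_I16] = { [T_U8] = k_i16_u8, [T_I16] = k_i16_i16 },
    };
    specializations_built++;
    return all[src][dst];
}

/* Entries start out NULL and are filled in on first use. */
static kernel_fn cache[T_COUNT][T_COUNT];

/* Return the loop for this type pair, building it only the first time. */
kernel_fn get_kernel(enum elem_type src, enum elem_type dst)
{
    assert(src < T_COUNT && dst < T_COUNT);
    if (cache[src][dst] == NULL)
        cache[src][dst] = specialize(src, dst);
    return cache[src][dst];
}
```

With this layout only the type combinations that are actually requested pay the specialization cost, and a repeated request for the same pair returns the cached loop.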

Regards

Daniel

From:	lloda
Subject:	Re: more advanced bytevector => supervectors
Date:	Sat, 11 Sep 2021 20:21:27 +0200