Once I have this code working I will start examining the impact of the design. I suspect that for complex cases, if it is
not compiled, it will be way slower than the compiled part, and this really is the hot path. For one thing, we will create 2 closures
that allocate memory at each iteration if they are not inlined. The best solution is that wingo can design a very fast compiler targeting
this case from the beginning, meaning that guile just handles it perfectly even with potentially 1000s of cases. A second
possibility is that guile had a fast compiler that would optimise a lambda when you feed one to it. We could simulate this
by simply passing lists representing code and using compile to compile them, but my experience is that this is very time
consuming. I can experiment a little here to see the actual timing.
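A first timing experiment could look roughly like the sketch below. The names accessor-source, time-compile and get-f32-little are made up for illustration. It assumes Guile's compile from (system base compile), called with the current module as environment so that the bytevector bindings are visible to the compiled code.

```scheme
(use-modules (system base compile)
             (rnrs bytevectors))

;; Hypothetical accessor, kept as a plain list so it can be handed to compile.
(define accessor-source
  '(lambda (bv k) (bytevector-ieee-single-ref bv k 'little)))

;; Compile SOURCE n times and return the average wall-clock seconds per call.
(define (time-compile source n)
  (let ((start (get-internal-real-time)))
    (do ((i 0 (+ i 1)))
        ((= i n))
      (compile source #:env (current-module)))
    (/ (- (get-internal-real-time) start)
       (* 1.0 n internal-time-units-per-second))))

;; Compiling the list once gives back an ordinary procedure.
(define get-f32-little (compile accessor-source #:env (current-module)))

(display (time-compile accessor-source 100))
(newline)
```

The number printed depends entirely on the machine and the Guile version, so it only gives a feeling for whether compiling per case is affordable.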
Anyhow, the idea with a fast compiler is that the first compiler could prepare the setup so that it is really fast to
compile compared to starting from scratch. Here advice from wingo would be appreciated.
A final possibility, which is not too bad speedwise, is to do the following inside the loop and create one big dispatch
like this that is executed each iteration:
(let ((val (if (eq? endian 'little)
               (if (= m 4) (get-f32 v1 k1 'little) (get-d64 v1 k1 'little))
               ...)))
  (if (eq? endian 'little)
      (if (= m 4) (set-f32 v1 k1 val 'little) (set-d64 v1 k1 val 'little))
      ...))
This is ideally what the code should compile to if it can't create all possible loops.
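To make the per-iteration dispatch concrete, here is a small runnable sketch. It uses the standard (rnrs bytevectors) accessors in place of get-f32 / set-f32 and friends. The procedure name copy-floats/dispatch! and the choice to copy from one bytevector to another are assumptions made only for the example.

```scheme
(use-modules (rnrs bytevectors))

;; Hypothetical helper: copy N floats of width M bytes (4 or 8) from SRC to
;; DST, re-deciding endianness and width for every single element.
(define (copy-floats/dispatch! src dst n m endian)
  (let lp ((i 0))
    (when (< i n)
      (let* ((k (* i m))
             (val (if (eq? endian 'little)
                      (if (= m 4)
                          (bytevector-ieee-single-ref src k 'little)
                          (bytevector-ieee-double-ref src k 'little))
                      (if (= m 4)
                          (bytevector-ieee-single-ref src k 'big)
                          (bytevector-ieee-double-ref src k 'big)))))
        (if (eq? endian 'little)
            (if (= m 4)
                (bytevector-ieee-single-set! dst k val 'little)
                (bytevector-ieee-double-set! dst k val 'little))
            (if (= m 4)
                (bytevector-ieee-single-set! dst k val 'big)
                (bytevector-ieee-double-set! dst k val 'big)))
        (lp (+ i 1))))))

;; Usage: two little-endian f32 values copied element by element.
(define src (make-bytevector 8 0))
(define dst (make-bytevector 8 0))
(bytevector-ieee-single-set! src 0 1.5 'little)
(bytevector-ieee-single-set! src 4 -2.25 'little)
(copy-floats/dispatch! src dst 2 4 'little)
```

The branches are cheap, but they are taken on every iteration even though endian and m never change inside the loop.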
Now, I do not like adjusting my code to output this, as it makes the framework less powerful and useful when every case
becomes a special case. But what if you could mark a piece of code as less important? What we want is a dispatch like so:
(if (eq? endian 'little) #:level-2 ...)
In the first pass the if will be resolved if endian is known (which reduces complexity); otherwise the first pass freezes
this one and continues with the whole shebang. Level 2 will be the basic compiler, where the #:level-2 tag is simply ignored.
Maybe this is a non-issue and the compiler handles this gracefully.
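One way to read this idea is as a small rewrite over the code represented as lists. The sketch below is only my guess at how such a first pass could behave, not an existing Guile feature. The name first-pass and the alist of known variables are invented. It assumes the test is written as (eq? var 'value), and it treats "freezing" as dropping the tag and leaving the if for the basic compiler.

```scheme
(use-modules (ice-9 match))

;; KNOWN is an alist of variables whose values are known in the first pass,
;; for example '((endian . little)).
(define (first-pass form known)
  (match form
    ;; A tagged dispatch: (if (eq? var 'val) #:level-2 conseq alt)
    (('if ('eq? var ('quote val)) '#:level-2 conseq alt)
     (let ((hit (and (symbol? var) (assq var known))))
       (if hit
           ;; Value known: keep only the branch that will be taken.
           (first-pass (if (eq? (cdr hit) val) conseq alt) known)
           ;; Value unknown: drop the tag and leave a plain if behind.
           `(if (eq? ,var ',val)
                ,(first-pass conseq known)
                ,(first-pass alt known)))))
    ;; Any other list: rewrite the elements.
    ((? list? xs) (map (lambda (x) (first-pass x known)) xs))
    ;; Atoms are returned unchanged.
    (_ form)))

(define tagged
  '(if (eq? endian 'little) #:level-2
       (get-f32 v k 'little)
       (get-f32 v k 'big)))

(define resolved (first-pass tagged '((endian . little))))
(define frozen   (first-pass tagged '()))
```

Here resolved is just the little-endian branch, while frozen is the same if without the tag, ready for the ordinary compiler.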
Also, the compiler could note that endian, nbits, single?, float? etc. are really created outside the loop and prepare the code
for handling all cases: essentially make sure to compile all nodes and reserve an area in the code to modify. Then, before the
loop, the code can decide which version to use (here we can use padding, or a goto in case the padded area is so large
that a goto saves time). This means that the compiler has 33 cases for the ref and 33 cases for the set! part in my most general
version, which is ok as each of them is typically small. So if I were the compiler I would produce the following layout, in pseudocode:
(if ... (copy RefStub1 to StubX ...))
(if ... (copy SetStub1 to StubY ...))
(let lp (...)
  (let ((val StubX))
    (iwhen ... (lp ...))))
This can be quite fast.
Self-modifying code rocks!!!
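Plain Scheme cannot patch machine code, but the same layout can be approximated by choosing closures before the loop. The sketch below is that approximation, not real self-modifying code. The names select-ref-stub, select-set-stub and copy-floats/stubs! are hypothetical, and only four of the cases (two widths, two endiannesses) are shown.

```scheme
(use-modules (rnrs bytevectors))

;; Choose the "ref stub" once, outside the loop.
(define (select-ref-stub m endian)
  (if (eq? endian 'little)
      (if (= m 4)
          (lambda (bv k) (bytevector-ieee-single-ref bv k 'little))
          (lambda (bv k) (bytevector-ieee-double-ref bv k 'little)))
      (if (= m 4)
          (lambda (bv k) (bytevector-ieee-single-ref bv k 'big))
          (lambda (bv k) (bytevector-ieee-double-ref bv k 'big)))))

;; Choose the "set stub" once, outside the loop.
(define (select-set-stub m endian)
  (if (eq? endian 'little)
      (if (= m 4)
          (lambda (bv k v) (bytevector-ieee-single-set! bv k v 'little))
          (lambda (bv k v) (bytevector-ieee-double-set! bv k v 'little)))
      (if (= m 4)
          (lambda (bv k v) (bytevector-ieee-single-set! bv k v 'big))
          (lambda (bv k v) (bytevector-ieee-double-set! bv k v 'big)))))

;; The loop body only calls the two preselected stubs.
(define (copy-floats/stubs! src dst n m endian)
  (let ((stub-x (select-ref-stub m endian))
        (stub-y (select-set-stub m endian)))
    (let lp ((i 0))
      (when (< i n)
        (let ((val (stub-x src (* i m))))
          (stub-y dst (* i m) val)
          (lp (+ i 1)))))))

;; Usage: two big-endian doubles.
(define src (make-bytevector 16 0))
(define dst (make-bytevector 16 0))
(bytevector-ieee-double-set! src 0 1.5 'big)
(bytevector-ieee-double-set! src 8 -2.25 'big)
(copy-floats/stubs! src dst 2 8 'big)
```

The dispatch now runs twice per call instead of twice per element. What remains is an indirect call per access, which is the cost that copying a stub into the loop body would remove.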