Branching now works, and a large enough subset of the VM is translatable that some interesting benchmarks can be run.
By skipping the goto dispatch structure, the win is maybe 3-4x for simple numerical loops. I expect
these loops to gain another factor of 2 when the wip-rtl branch is translated in the same way. The reason is that the overhead mainly consists of the instructions that move values to and from the cache, and rtl seems to decrease the number of such operations. To measure these numbers, I've been incrementing fixnums and walking
through lists of size 10000.
One thing to note about that code is that it piggy-backs onto the C stack rather than working with its own. I bet that is not optimal, but it's what I did, and it should mean that switching to C code from
natively compiled or JIT-compiled code is fast.