Re: thoughts on targetting the web

From: Chris Lemmer-Webber
Subject: Re: thoughts on targetting the web
Date: Sat, 19 Jun 2021 18:24:03 -0400
User-agent: mu4e 1.4.15; emacs 27.2

O M G!  So excited to read this email.  Yes!

Andy Wingo writes:

> Hi :)
> A brain-dump tonight.  I was thinking about what it would mean for Guile
> to target the web.  The goal would be for Guile to be a useful language
> for programming client-side interaction on web pages.  Now, there are a
> number of takes for Schemes of various stripes to target JS.  I haven't
> made a full survey but my starting point is that in a Guile context, we
> basically want the Guile Scheme language, which is to say:
>  1. Tail calls.  The solution must include proper tail calls.
>  2. Multiple values.  The solution should have the same semantics that
>     Guile has for multiple-value returns, including implicit
>     truncation.
>  3. Varargs.  The solution should allow for rest arguments, keyword
>     arguments, and the like.
>  4. Delimited continuations.  The solution should allow non-local exit
>     from any function, and should allow for capturing stack slices to
>     partial continuations, and should allow those continuations to be
>     instantiated multiple times if desired.  We should be able to target
>     fibers to the web.  Otherwise, why bother?
>  5. Garbage collection.  *We should re-use the host GC*.  Although it
>     would be possible to manage a heap in linear memory, that has
>     retention problems due to cycles between the Guile heap and the JS
>     heap.

From when we spoke not long ago, it sounds like progress on the WASM-GC
proposal has been happening, I think?  So this seems feasible?

However, I do think that a "proof of concept" GC would still be an
advancement.  Like, even lifting the dead-simple version from SICP as a
stopgap.  We'll all know it's not ideal, but it'll still be progress.
Getting something going and showing it is helpful.

That said, I also know how meh such a proposed approach can feel. ;)

>  6. Numbers.  We should support Scheme's unbounded integers.  Requires
>     bignums in the limit case, but thankfully JS has bignums now.  We
>     will still be able to unbox in many cases though.
>  7. Target existing web.  We shouldn't depend on some future WebAssembly
>     or JS proposal -- the solution should work in the here and now and
>     only get better as features are added to the web platform.

Yes.  Though, to jump back again to when you and I talked offline, I
think what I recommended was that as an early proof of concept, a good
idea might be to just have a small number of APIs, or even better, make a
mini "fantasy console".  Here's an interesting example (and the one we
talked about, off-list):

What's interesting about this first is that it's just a Racket program
that generates a quasiquoted s-expression representation of WASM.  Now,
obviously your compiler will do more interesting things than that.  But
the second interesting thing is the "fantasy console" component... a set
of custom interfaces to render a simple 8-bit aesthetic console.

My argument for this is: supporting "the existing web" is a big step.
Targeting a much smaller number of APIs, and being able to show the
world a demo that Guile's WASM compiler can do something cool... that's
very motivating, I think.

But an even simpler hello world would be to simply have access to
console.log, and write a scheme program that does ye olde Fibonacci
sequence and spits out the results on the page.  Not as exciting, but
that's the reasonable (and classic scheme) first step I think.
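
For concreteness, a minimal sketch of that hello world in plain Guile
might look like the following.  It assumes nothing about the eventual
compiler: `display` merely stands in for whatever console.log binding a
web backend might expose (that binding is hypothetical), and `fib` and
`print-fibs` are illustrative names.

```scheme
;; Sketch only: `display` stands in for a hypothetical console.log
;; binding; nothing here depends on a WASM backend existing yet.

;; Iterative Fibonacci.  The named-let loop is a proper tail call,
;; so this runs in constant stack space.
(define (fib n)
  (let loop ((a 0) (b 1) (i 0))
    (if (= i n)
        a
        (loop b (+ a b) (+ i 1)))))

;; Print the first COUNT Fibonacci numbers, one per line.
(define (print-fibs count)
  (let loop ((i 0))
    (when (< i count)
      (display (fib i))
      (newline)
      (loop (+ i 1)))))

(print-fibs 10)   ; prints 0 1 1 2 3 5 8 13 21 34, one per line
```

Even something this small would exercise tail calls and, for larger
inputs, unbounded integers.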

But yes, target the existing web... eventually!  Just don't get so hung
up on worrying about those interfaces that it distracts from making
something that can generate some WASM, is my thinking.

> From a UX perspective, I would expect we would generally want
> whole-program compilation with aggressive DCE / tree-shaking, producing
> a single JS or WebAssembly artifact at build-time.  But that's a later
> issue.

Yes, focus on things working first... major optimizations can come later!

> I have thought about this off and on over the years but in the end was
> flummoxed about how to meet all requirements.  However recently I think
> I have come up with a solution for most of these:
>  (1) In JS, tail calls are part of ES2015, but not implemented in all
>      browsers.  In WebAssembly, they are a future plan, but hard for
>      various reasons.  So in practice the solution for today's web is to
>      use a universal trampoline and make all calls tail calls --
>      i.e. call all functions via:
>        while true:
>          call(pop())
>      Whenever we can target proper tail calls, this will only get
>      faster.
>  (2) Neither JS nor WebAssembly have the truncation semantics we want.
>      Therefore we will return values via an explicit array, and then the
>      continuation will be responsible for consuming those values and
>      binding any needed variables.
>  (3) Although JS does have varargs support, WebAssembly does not.  But
>      we can actually use the same solution here as we use for (1) and
>      (2) -- pass arguments on an explicit array + nvalues, and rely on
>      the function to parse them appropriately.  In this way we can
>      get Guile keyword arguments.  This also makes the type of Scheme
>      functions uniform, which is important in WebAssembly contexts.
>  (4) This is the interesting bit!  As hinted in (1), we will transform
>      the program such that all calls are tail calls.  This is a form of
>      minimal CPS conversion -- minimal in the sense that functions are
>      broken up into the minimum number of pieces to ensure the
>      all-tail-calls property.  Non-tail calls transform to tail calls by
>      saving their continuation's state on a stack, which is the same as
>      stack allocation for these continuations.  The continuation of the
>      call pops the saved state from the stack.  Because the stack is
>      explicit, we can manipulate it as data: slice it to capture a
>      delimited continuation, drop values to make a non-local exit, splat
>      a saved slice back on to compose a captured delimited continuation
>      with the current continuation, and so on.  Therefore we have the
>      necessary primitives to implement delimited continuations as a
>      library.

I think I understand 15% of this, but the amount I could absorb in the
limited time I spent stewing on it sounds cool anyway.  Neat!  Most
importantly, you have a good idea that you think will work, so yay!
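
To check my own understanding of (1) and (4), here is a rough sketch in
plain Scheme of a trampoline driving a hand-tailified factorial.  This
is not the wip-tailify implementation: `stack`, `push!`, `pop!`,
`return`, `halt` and `run-tailified` are all names made up for
illustration, and a real backend would work on CPS rather than on
closures.

```scheme
;; Sketch only: every name below is hypothetical, not a Guile primitive.

;; The explicit continuation stack, kept as an ordinary list.
(define stack '())
(define (push! v) (set! stack (cons v stack)))
(define (pop!)
  (let ((v (car stack)))
    (set! stack (cdr stack))
    v))

;; Returning pops the return label and asks the trampoline to call it.
(define (return v)
  (let ((k (pop!)))
    (lambda () (k v))))

;; Bottom-of-stack continuation: wrap the result in a pair so the
;; trampoline can tell it apart from a pending call.
(define (halt v) (cons 'done v))

;; Source: (define (fact n) (if (= n 0) 1 (* n (fact (- n 1)))))
;; The non-tail call to fact is split at its return point.
(define (fact n)
  (if (= n 0)
      (return 1)
      (begin
        (push! n)        ; save the live variable
        (push! fact-k)   ; save the return label
        (lambda () (fact (- n 1))))))

;; The return point: restore n, then finish the multiplication.
(define (fact-k v)
  (let ((n (pop!)))
    (return (* n v))))

;; The driver trampoline: keep calling until a tail yields a result.
(define (run-tailified proc . args)
  (set! stack (list halt))
  (let loop ((next (lambda () (apply proc args))))
    (if (procedure? next)
        (loop (next))
        (cdr next))))

(display (run-tailified fact 5))   ; prints 120
(newline)
```

Because `stack` is just data here, slicing it or dropping entries is
ordinary list manipulation, which is, as far as I can tell, the property
the delimited continuation story relies on.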

>  (5) Scheme needs a representation that can store any value.  In JS this
>      is easy because JS is untyped.  For WebAssembly, I think I would
>      lean on externref for this purpose, which effectively boxes all
>      values.  There are no fixnums in the current WebAssembly spec, so
>      this is suboptimal, and we have to call out to JS to evaluate type
>      predicates and access fields.  But, support for structured
>      GC-managed types is coming to WebAssembly, which will speed up
>      object access.
>  (6) The easy solution here is to make JS numbers, which are doubles at
>      heart, represent flonums, and use JS bignums for Scheme integers.
>      Fine.
>  (7) This principle has been taken into account in (1)-(6).
> Now, a note on the transformation described in (4), which I call
> "tailification".
> The first step of tailification computes the set of "tails" in a
> function.  The function entry starts a tail, as does each return point
> from non-tail calls.  Join points between different tails also start
> tails.
> In the residual program, there are four ways that a continuation exits:
>   - Tail calls in the source program are tail calls in the residual
>     program; no change.
>   - For non-tail calls in the source program, the caller saves the state
>     of the continuation (the live variables flowing into the
>     continuation) on an explicit stack, and saves the label of the
>     continuation.  The return continuation will be converted into an
>     arity-checking function entry, to handle multi-value returns; when
>     it is invoked, it will pop its incoming live variables from the
>     continuation stack.
>   - Terms that continue to a join continuation are converted to label
>     calls in tail position, passing the state of the continuation as
>     arguments.
>   - Returning values from a continuation pops the return label from the
>     stack and does an indirect tail label call on that label, with the
>     given return values.
> I expect that a tailified program will probably be slower than a
> non-tailified program.  However a tailified program has a few
> interesting properties: the stack is packed and only contains live data;
> the stack can be traversed in a portable way, allowing for
> implementation of prompts on systems that don't support them natively;
> and as all calls are tail calls, the whole system can be implemented
> naturally with a driver trampoline on targets that don't support tail
> calls (e.g. JavaScript and WebAssembly).
> This algorithm is implemented in the wip-tailify branch in Guile's git.
> I have this little driver program:
>     (use-modules (system base compile)
>                  (system base language)
>                  (language cps tailify)
>                  (language cps verify)
>                  (language cps renumber)
>                  (language cps dce)
>                  (language cps simplify)
>                  (language cps dump))
>     (define (tailify* form)
>       (let ((from 'scheme)
>             (make-lower (language-lowerer (lookup-language 'cps)))
>             (optimization-level 2)
>             (warning-level 2)
>             (opts '()))
>         (define cps
>           ((make-lower optimization-level opts)
>            (compile form #:from 'scheme #:to 'cps #:opts opts
>                     #:optimization-level optimization-level
>                     #:warning-level warning-level
>                     #:env (current-module))
>            (current-module)))
>         (format #t "Original:\n")
>         (dump cps)
>         (format #t "\n\nTailified:\n")
>         (let* ((cps (tailify cps))
>                (cps (verify cps))
>                (cps (eliminate-dead-code cps))
>                (cps (simplify cps))
>                (cps (renumber cps)))
>           (dump cps))))
> If I run this on the following lambda:
>   (lambda (x)
>     (pk x
>         (if x
>             (pk 12)
>             (pk 42))))
> Then first it prints the dump of the optimized CPS:
>   Original:
>   L0:
>     v0 := self
>     L1(...)
>   L1:
>     receive()
>     v1 := current-module()                      ; module 
>     cache-set![0](v1)
>     v2 := const-fun L7                          ; _ 
>     return v2
>   L7:
>     v3 := self
>     L8(...)
>   L8:
>     v4 := receive(x)                            ; x 
>     v5 := cache-ref[(0 . pk)]()                 ; cached 
>     heap-object?(v5) ? L16() : L11()
>   L16():
>     L17(v5)
>   L11():
>     v6 := cache-ref[0]()                        ; mod 
>     v7 := const pk                              ; name 
>     v8 := lookup-bound(v6, v7)                  ; var 
>     cache-set![(0 . pk)](v8)
>     L17(v8)
>   L17(v9):                                      ; box 
>     v10 := scm-ref/immediate[(box . 1)](v9)     ; arg 
>     false?(v4) ? L22() : L19()
>   L22():
>     v12 := const 42                             ; arg 
>     call v10(v12)
>     L25(receive(arg, rest...))
>   L19():
>     v11 := const 12                             ; arg 
>     call v10(v11)
>     L25(receive(arg, rest...))
>   L25(v13, v14):                                ; tmp, tmp 
>     tail call v10(v4, v13)
> Then it prints the tailified version:
>   L0:
>     v0 := self
>     L1(...)
>   L1:
>     receive()
>     v1 := current-module()                      ; module 
>     cache-set![0](v1)
>     v2 := const-fun L8                          ; _ 
>     v3 := restore1[ptr]()                       ; ret 
>     tail calli v3(v2)
>   L8:
>     v4 := self
>     L9(...)
>   L9:
>     v5 := receive(x)                            ; x 
>     v6 := cache-ref[(0 . pk)]()                 ; cached 
>     heap-object?(v6) ? L17() : L12()
>   L17():
>     L18(v6)
>   L12():
>     v7 := cache-ref[0]()                        ; mod 
>     v8 := const pk                              ; name 
>     v9 := lookup-bound(v7, v8)                  ; var 
>     cache-set![(0 . pk)](v9)
>     L18(v9)
>   L18(v10):                                     ; box 
>     v11 := scm-ref/immediate[(box . 1)](v10)    ; arg 
>     false?(v5) ? L24() : L20()
>   L24():
>     v14 := const 42                             ; arg 
>     v15 := code L37                             ; cont 
>     save[(scm scm ptr)](v5, v11, v15)
>     tail call v11(v14)
>   L20():
>     v12 := const 12                             ; arg 
>     v13 := code L29                             ; cont 
>     save[(scm scm ptr)](v5, v11, v13)
>     tail call v11(v12)
>   L29:
>     L30(...)
>   L30:
>     v16, v17 := receive(arg, rest...)           ; tmp, tmp 
>     v18, v19 := restore[(scm scm)]()            ; restored, restored 
>     tail callk L34(_, v18, v19, v16)
>   L34:
>     meta: arg-representations: (scm scm scm)
>     L35(...)
>   L35(v20, v21, v22):                           ; _, _, _ 
>     tail call v21(v20, v22)
>   L37:
>     L38(...)
>   L38:
>     v23, v24 := receive(arg, rest...)           ; tmp, tmp 
>     v25, v26 := restore[(scm scm)]()            ; restored, restored 
>     tail callk L34(_, v25, v26, v23)
> I think I would like to implement the "save" and "restore" primcalls on
> stock Guile, to be able to adequately test tailification.  But then I'd
> look at targetting WebAssembly (just to make it interesting; run-time
> performance would be faster no doubt if I targetted JS, due to GC
> concerns).
> Anyway, thoughts welcome.  Happy hacking :)
> Andy

This is all great.  I think getting something that works well enough to
write some proof-of-concept demos in is the real goal.  If you can make
something that can compile just enough to WASM where me adding the
fantasy console stuff would help (I might just lift wasm-adventure's),
then I think we could make some pretty cool demos for the world.  Maybe
we could even have a special guile-fantasy-console gamejam? ;)

Anyway exciting, exciting, exciting!  My main advice in all of it: don't
let great be the enemy of good-enough, and all that.  Getting to the
point where we're able to do *something* is still useful, if the hooks
are in place to iterate from good-enough to great over time.


 - Chris
