guile-user
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Summer of Code Recap


From: Ian Price
Subject: Summer of Code Recap
Date: Mon, 28 Aug 2017 19:56:54 +0100

1 Introduction
==============

  As many of you are aware, I have been working on compiling Guile
  Scheme to JavaScript this summer, as part of the Google Summer of
  Code. This post serves to bookend my work for the year.

  Before I go any further, I have to give my thanks to my mentor [Chris
  Webber], without whom this project would have fizzled out weeks ago;
  Google and the Gnu Project, naturally, for providing the Summer of
  Code and allowing me to work on this project; and our fearless leader,
  [Andy Wingo], for answering a wide variety of stupid questions.


[Chris Webber] https://dustycloud.org/

[Andy Wingo] https://wingolog.org/


2 Project Aims
==============

  For a full introduction to the project, you can of course refer back
  to my [project proposal], but very briefly my hopes for this summer
  were:

  1. To rewrite the previous version of my compiler from the [previous
     CPS representation] to use the new representation ["CPS Soup"]
     representation.
  2. To completely port ice-9/boot-9.scm (our basic "prelude") to
     JavaScript, and in particular, to support the [Guile Module
     system].
  3. To handle Proper Tail Calls by use of the [Cheney on the MTA]
     strategy.
  4. To include a new `guild' script for bundling compiled JS files with
     their dependencies.


[project proposal] https://shift-reset.com/static/docs/gsoc-2017.pdf

[previous CPS representation]
https://wingolog.org/archives/2014/01/12/a-continuation-passing-style-intermediate-language-for-guile

["CPS Soup"] https://wingolog.org/archives/2015/07/27/cps-soup

[Guile Module system]
https://www.gnu.org/software/guile/manual/html_node/Modules.html#Modules

[Cheney on the MTA] http://www.pipeline.com/~hbaker1/CheneyMTA.html


3 What was Achieved
===================

  You can find all of my work on the [compile-to-js-2017] branch of my
  Gitlab. A full list of the commits can be found [here], but I will
  summarise the changes now:


[compile-to-js-2017]
https://gitlab.com/ijp/guile/tree/compile-to-js-2017

[here] https://gitlab.com/ijp/guile/compare/1b36a76e...gsoc-2017-end

3.1 Compile Guile CPS Soup to JavaScript
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  When I was working on my initial attempt at compiling Guile to
  JavaScript, two years ago, Guile used a different CPS representation
  as its intermediate language. The initial experiments with the CPS
  Soup representation occurred while that work was ongoing, but as it
  was not considered "stable", the plan was not to move to this
  representation until after I had completed my other objectives.

  Now, however, CPS Soup is the IL of Guile, and so the first task that
  was accomplished was to move to this representation. Since I had
  already created my own JS-IL as a target, I did not need to make any
  changes to the code generation side from JS-IL to JavaScript proper.
  The main change was to reconstruct the nested scope structure that was
  implicit in the dominator structure that Guile made available.

  The full code for the compiler is split into several sections,
  corresponding to different stages in the compiler pipeline.


3.1.1 CPS to JS-IL Compiler
---------------------------

  - module/language/cps/compile-js.scm
  - module/language/cps/spec.scm

  These modules constitute the compiler from CPS to my JS-IL
  intermediate language.


3.1.2 JS-IL to JavaScript Compiler
----------------------------------

  - module/language/js-il.scm
  - module/language/js-il/compile-javascript.scm
  - module/language/js-il/inlining.scm
  - module/language/js-il/spec.scm

  These modules constitute a somewhat ad-hoc intermediate representation
  as a target for the CPS compiler. It differs from JavaScript, e.g., by
  continuing to separate continuations and functions, and a slightly
  specialised function representation to handle Guile's complicated
  notion of procedure arity.


3.1.3 JavaScript Representation
-------------------------------

  - module/language/javascript.scm
  - module/language/javascript/simplify.scm
  - module/language/javascript/spec.scm

  This is primarily the representation of JavaScript as Scheme Records.
  This is separate from the representation of JavaScript Guile already
  has in the form of `(language ecmascript)' primarily to avoid a
  circularity when Guile determines which compilers to run in the
  pipeline, as recommended by Andy Wingo.


3.2 A pre-amble capable of running through boot-9
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  In order to run Guile, it is not enough to be able to compile Scheme
  (or indeed any other language supported by Guile) forms to JavaScript,
  we also need to incorporate as much of Guile's runtime as possible.
  This involves implementing VM primitives (such as you might see in
  vm-engine.c); basic Guile types like Symbols, Pairs, and Structs; as
  well as many of the functions that Guile implements in C rather than
  Scheme.

  Although I certainly did not implement all of the functionality Guile
  achieves, I was able to implement sufficiently many (including what
  amounts to a port of much of module.c) that one can successfully run
  though ice-9/boot-9.scm from start to finish.

  This took up the bulk of the time I spent on this project, due to the
  size of the compiled output of boot-9.scm, and my own difficulties
  debugging the bootstrap process. More on this below.

  The code can be found at
  - module/language/js-il/runtime.js


3.3 A linking script for JavaScript
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  Since we are using the `(language ...)' infrastructure, we can take
  advantage of the existing `guild compile' script for compiling to
  JavaScript, we simply need to use the `--to' switch. However, this
  does not produce a file which you can just load up without any
  additional work, especially if you are working with multiple modules.

  In order to make it easier to deal with this, I have included a `guild
  jslink' script, which can be used to package up a "main" script along
  with the `runtime.js' and its dependencies. See below for an example.

  The code can be found at
  - module/scripts/jslink.scm


4 What was not Achieved
=======================

4.1 Cheney on the MTA
~~~~~~~~~~~~~~~~~~~~~

  One of my regrets is that I did not implement Baker's "Cheney on the
  MTA" (as seen in [Chicken Scheme]) for handling Proper Tail Calls in
  JavaScript. Historically, JavaScript has not guaranteed that tail
  position function calls do not grow the stack, and this is obviously
  of fundamental importance for languages like Scheme. Fortunately, ES6
  has added support for [proper tail calls] and we can expect to see
  increased support for it in future JavaScript versions. (Indeed,
  during testing on node v.6.10.3, I did not have to increase the stack
  size until very late).


[Chicken Scheme] https://www.call-cc.org/

[proper tail calls]
https://www.ecma-international.org/ecma-262/6.0/#sec-tail-position-calls


5 How to use it
===============

  I've talked a lot about what I've did and didn't do, but what about
  actually using this thing?


5.1 Obtaining the Code
~~~~~~~~~~~~~~~~~~~~~~

  The code is not currently available from the main Guile repository,
  but only the `compile-to-js-2017' branch on my [GitLab].

  If you already have a checkout of guile, you can add my repo as a
  remote with
  ,----
  | $ git remote add ijp https://gitlab.com/ijp/guile.git
  `----
  and fetch the branch with
  ,----
  | $ git fetch ijp
  `----

  You can then check out the `compile-to-js-2017' branch and build as
  normal.


[GitLab] https://gitlab.com/ijp/guile/tree/compile-to-js-2017


5.2 A Non-Trivial Example
~~~~~~~~~~~~~~~~~~~~~~~~~

  As an example of how to use the JS Backend that is short, but
  non-trivial, I am using John McCarthy's `amb' operator (see [A Basis
  for a Mathematical Theory of Computation]) to search for Pythagorean
  Triples.

  First we have a module for the `amb' operator in amb.scm
  ,----
  | (define-module (amb)
  |   #:export (amb fail))
  |
  | (define original-fail
  |   (lambda _
  |     (error 'amb "No more paths to search")))
  |
  | (define *amb-fail* original-fail)
  |
  | (define (fail)
  |   (*amb-fail* #f))
  |
  | (define (amb-thunks . values)
  |   (let ((failure *amb-fail*))
  |     (call/cc (lambda (escape)
  |                (for-each (lambda (value)
  |                            (call/cc (lambda (continue)
  |                                       (set! *amb-fail* continue)
  |                                       (escape (value)))))
  |                          values)
  |                (failure #f)))))
  |
  | (define-syntax amb
  |   (syntax-rules ()
  |     ((amb exprs ...)
  |      (amb-thunks (lambda () exprs) ...))))
  `----

  Next we have the code performs the search in triple.scm
  ,----
  | (use-modules (amb))
  |
  | (let ((a (amb 4 5 6 7 8 9 10))
  |       (b (amb 4 5 6 7 8 9 10))
  |       (c (amb 4 5 6 7 8 9 10)))
  |   (if (= (* c c) (+ (* a a) (* b b)))
  |       (list a b c)
  |       (fail)))
  `----

  We compile the files in the usual manner, only now we specify the
  `javascript' language (We make sure to add the current directory to
  the load-path for triple.scm).

  ,----
  | $ guild compile amb.scm --to=javascript --output=amb.js
  | $ guild compile -L . triple.scm --to=javascript --output=triple.js
  `----

  Next we link the two together into a file main.js, making sure to
  specify amb.js as a dependency of triple.js. (This step will take a
  little while, since it also compiles a bunch of dependencies)

  ,----
  | $ guild jslink triple.js -o main.js --depends="(\"amb\" . \"amb.scm\")"
  `----

  Finally, you can run it with `node', although as mentioned above you
  may have to increase the stack size.

  ,----
  | $ node  --stack-size=2000 main.js
  `----

  Which should, fingers crossed, print out the triple 6,8,10.


[A Basis for a Mathematical Theory of Computation]
http://www-formal.stanford.edu/jmc/basis1.pdf


6 What is next?
===============

  Having recapped what was and what was not achieved, the next question
  is: where does the project go from here? I have been asked about my
  plans for all sorts of features, e.g. support for [Web Assembly], but
  I think the following things are the most important to think about.


[Web Assembly] http://webassembly.org/

6.1 Inclusion into Guile
~~~~~~~~~~~~~~~~~~~~~~~~

  The entire point of the project is to have something that can be
  included in Guile proper. I have not spoken with Guile's maintainers
  about incorporation into the main distribution, but I expect there
  would be not be too many problems with moving the "official branch" to
  the main repository.


6.2 All Guile built-ins in runtime.js
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  Although I have included enough to get though boot-9.scm, this does
  not include all of the built-ins we would want in our programs. Two
  things I use very often which do not appear in runtime.js are ports
  and bytevectors.

  We would like most, if not all, Guile built-ins to be available for
  those who need them, so these will need to be implemented. However,
  this is a lot of extra code for some people who don't need it, which
  brings us to a different issue...


6.3 Linking Guile Modules & Features
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  In [a blog post], Andy Wingo lays out many tasks that he would like to
  see in a future Guile. One of the most important of these, for us, are
  under the headings "linking multiple modules together" and "linking a
  single executable". To grossly simplify, we want to be able to link
  various files into one single executable, which contains all and only
  the code we need for our application.

  As it stands, I included a simple script `guild jslink' that bundles
  various compiled JavaScript files into one file, but we would like it
  to be much more featureful: removing modules, functions, even types we
  don't need; and inferring which modules are required by our
  application and bundling them without requiring the information
  `jslink' does. This would allow us to minimise the amount of code that
  needs to be sent over the network, which is very important to web
  developers.

  This is a large task, and one I don't know enough about at the moment
  to attempt, but it is work that would benefit not just our JavaScript
  compiler, but people who want to deploy regular Guile applications.


[a blog post]
https://wingolog.org/archives/2016/02/04/guile-compiler-tasks


6.4 JavaScript Version
~~~~~~~~~~~~~~~~~~~~~~

  I am not an expert in JavaScript, in fact, before this summer I
  probably hadn't written it for two years, which means the code
  certainly does not match up with the current best practices and
  specifications. Further, all of my testing for this compiler was done
  on [Node.js] v.6.10.3 only (this was the version available in the
  Fedora 25 repositories).

  The code should be vetted to determine precisely which modern JS
  features are used (I believe proper tail calls, and ES6 Maps are the
  main ones), and it should be tested on all major browsers. If
  necessary, we should incorporate switches in the compiler to allow JS
  users to compile for particular implementations, taking advantage of
  particular modern JS features, or providing our own implementations of
  those that are not supported (e.g. Cheney on the MTA).


[Node.js] https://nodejs.org/en/


6.5 JS Integration
~~~~~~~~~~~~~~~~~~

  One of the strengths of Guile is that it allows people to integrate
  their Scheme and C code, and although it has not been a focus for this
  summer, we should aim to provide similar levels of integration between
  Scheme and JS. There are two cases to consider.


6.5.1 JS calling Scheme
-----------------------

  As it stands, you can perform some limited interaction from JavaScript
  in a similar manner to how you would interact with Guile from C. For
  instance, by using `scm_current_module', `scm_public_lookup', and the
  `scheme.Symbol' constructor, one could look up a scheme function, e.g.
  `iota', and then invoke it by `scheme.call'.

  That said, C idioms are not JS idioms, and so we should work to
  provide a much nicer API through the `scheme' object.


6.5.2 Scheme calling JS
-----------------------

  In the case of Scheme calling JavaScript, I think we should follow the
  example of `(system foreign)', which provides an API for linking to
  dynamic C libraries, and creating Scheme versions of C functions, and
  automatically marshalling/unmarshalling C types to Scheme types. One
  additional complication we would have with JS would be the presence of
  exceptions, but I think these could also be marshalled into Scheme
  ones without much trouble.


7 Lessons Learned
=================

  It goes without saying that a project like this teaches you a lot
  about the technical design of Guile, how to navigate the codebase,
  etc, but I want to highlight a few "softer" lessons from this summer.


7.1 Compilers are "Easy", Runtimes are Hard
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  When I first set out to write this project two summers ago, I
  naturally assumed that the majority of the effort would go into the
  compiler, and much less into the built-ins. In reality, the effort was
  reversed. Partly this was due to my experience in writing Scheme, and
  Functional Programming more generally, meant that the tree-traversing
  code typical of a compiler pass was relatively straightforward, and
  the compiler was not doing a lot of optimisation, mostly code
  generation.


7.2 Bootstrapping is Hard
~~~~~~~~~~~~~~~~~~~~~~~~~

  The last point leads into this one, bootstrapping is pretty tricky.
  With boot-9, you have several versions of the module system at
  different times. My own attempt to write module code that handled this
  ended up being abandoned for a rewrite that more closely followed the
  Guile C code. The size of the compiled boot-9 code, and the, at times,
  non-local consequences of implementing certain built-ins made it
  tricky to debug.


7.3 Don't Panic
~~~~~~~~~~~~~~~

  This is a much more personal one, and one that I think is very
  important for anyone who wants to take part in a program like the
  Summer of Code, where you are spending a lot of time mostly on your
  own. In a complex software project, things are not always going to go
  smoothly. You might spend weeks banging up against a difficult
  problem. Don't Panic! If it was easy it would have already been done.
  Keep in Contact with your Mentor! It is tempting to only check in when
  you think you have something of progress to report, but they are there
  to help you, and explaining your issues to someone else is often very
  useful when trying to overcome them, even if they don't have an answer
  for you.


8 Wrapping Up
=============

  If you are still with me, good on you. As the new semester is starting
  I will be devoting much less time to this, and that will likely be
  true till December, but I will make an effort to keep up with
  guile-user and be on the IRC Channel to help the daring souls who want
  to give this a go. My priorities will be documenting the ILs, filling
  in missing builtins, and improving jslink. I especially want to see
  basic IO and MiniKanren up and running, and for it to be convenient to
  use Guile's builtin libraries.


Happy Hacking, Ian Price

(This is a crosspost to guile-user of my blogpost [Summer of Code
Recap], but please comment on this list, rather than there)

[Summer of Code Recap]
https://shift-reset.com/blog/2017/8/28/Summer%20of%20Code%20Recap/


reply via email to

[Prev in Thread] Current Thread [Next in Thread]