Re: srfi-18 and the vm

From: Andy Wingo
Subject: Re: srfi-18 and the vm
Date: Sun, 31 May 2009 15:30:50 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.92 (gnu/linux)

Hi Neil,

On Sun 31 May 2009 01:07, Neil Jerram <address@hidden> writes:

> Andy Wingo <address@hidden> writes:
>> In the short term (within the next year or so), I would imagine that
>> ceval/deval would be faster than an eval written in Scheme -- though I
>> do not know.
> I wasn't thinking of an eval written in Scheme.  I was assuming the
> other option that you mention below, i.e. eval becomes compile on the
> fly followed by VM execution.

Incidentally, the REPL compiles by default now, and has done so since the
VM was merged to master. But an expression typed at the REPL is much less
code than a full library.

>> But compilation does take some time.
> I guess this is the key point.  Even when the compiler has itself been
> compiled?

Much less time in that case, of course. It's the difference between the
load-compiled case and the ceval case below.

I think my understanding of the performance equation goes like this:

  1.8 ceval time =
     (read time) + ((% of executed code) * (memoization time)) + (ceval runtime)

  1.9 ceval time =
     (psyntax expansion time) + (1.8 ceval time)

  1.9 compile time =
     (read time) + (psyntax expansion time) + (compile time)

  1.9 load-compiled time =
     (VM runtime)

  1.9 compile-and-load time =
     (1.9 compile time) + (1.9 load-compiled time)

Let's say that (VM runtime) == 1/4 * (ceval runtime). Then:

  (1.9 compile-and-load time) < (1.9 ceval time)

when

  (compile time) < ((% of executed code) * (memoization time)) + 3/4 * (ceval runtime)

Compile time and memoization time depend linearly on the "complexity" of
the code, as you say.

And my suspicion is that memoization doesn't actually take very much
time, compared to expansion. So a simpler version would be that we win when

  (compile time) < 3/4 * (ceval runtime)

Of course there is probably a lower bound to all of this: we don't care
about differences under 50 ms or so when operating on non-compiled code.
So if we keep compile time always under 40 ms or so, we're good.

Now how do we stack up, then? Well:

    scheme@(guile-user)> (use-modules (ice-9 time) (system base compile))
    scheme@(guile-user)> (time (compile-file "module/language/tree-il.scm"))
    clock utime stime cutime cstime gctime
     1.17  1.03  0.14   0.00   0.00   0.30
    $1 = "module/language/tree-il.go"
    scheme@(guile-user)> (time (compile-file "module/language/tree-il.scm"))
    clock utime stime cutime cstime gctime
     1.06  0.91  0.14   0.00   0.00   0.16
    $2 = "module/language/tree-il.go"
    scheme@(guile-user)> (time (compile-file 
    clock utime stime cutime cstime gctime
     0.03  0.03  0.00   0.00   0.00   0.01
    $3 = "module/language/tree-il/spec.go"
    scheme@(guile-user)> (time (compile-file 
    clock utime stime cutime cstime gctime
     0.02  0.02  0.00   0.00   0.00   0.00
    $4 = "module/language/tree-il/spec.go"
    scheme@(guile-user)> (time (compile-file 
    clock utime stime cutime cstime gctime
     0.25  0.20  0.05   0.00   0.00   0.04
    $5 = "module/language/assembly/disassemble.go"
    scheme@(guile-user)> (time (compile-file 
    clock utime stime cutime cstime gctime
     0.26  0.23  0.03   0.00   0.00   0.03
    $6 = "module/language/assembly/disassemble.go"
    scheme@(guile-user)> (time (compile-file 
    clock utime stime cutime cstime gctime
     0.53  0.44  0.09   0.00   0.00   0.10
    $7 = "module/language/tree-il/compile-glil.go"
    address@hidden:~/src/guile$ ls -l module/language/tree-il.scm 
module/language/tree-il/spec.scm module/language/assembly/disassemble.scm 
    -rw-rw-r-- 1 wingo wingo  6467 2009-05-29 15:39 
    -rw-rw-r-- 1 wingo wingo 16377 2009-05-29 15:39 
    -rw-rw-r-- 1 wingo wingo 11734 2009-05-29 15:39 module/language/tree-il.scm
    -rw-rw-r-- 1 wingo wingo  1390 2009-05-29 15:39 

This is on my laptop, with the ondemand CPU frequency governor.

It seems expansion itself is taking a bit of this time:

    scheme@(guile-user)> ,m language tree-il
    scheme@(language tree-il)> (use-modules (ice-9 time))
    scheme@(language tree-il)> (time (with-input-from-file 
"module/language/tree-il.scm" (lambda () (let lp ((x (read))) (if (not 
(eof-object? x)) (begin (sc-expand x) (lp (read))))))))
    clock utime stime cutime cstime gctime
     0.44  0.44  0.01   0.00   0.00   0.12
    scheme@(language tree-il)> ,m language assembly disassemble
    scheme@(language assembly disassemble)> (use-modules (ice-9 time))
    scheme@(language assembly disassemble)> (time (with-input-from-file 
"module/language/assembly/disassemble.scm" (lambda () (let lp ((x (read))) (if 
(not (eof-object? x)) (begin (sc-expand x) (lp (read))))))))
    clock utime stime cutime cstime gctime
     0.10  0.09  0.00   0.00   0.00   0.02

Indeed, the expander seems to take most of the time when loading (ice-9
match). I've run all of these in fresh Guiles so one expansion doesn't
have the cost of creating a module:

    scheme@(guile-user)> (time (load "module/ice-9/match.scm"))
    clock utime stime cutime cstime gctime
     1.02  0.98  0.01   0.00   0.00   0.26
    scheme@(guile-user)> (time (with-input-from-file "module/ice-9/match.scm" 
(lambda () (let lp ((x (read))) (if (not (eof-object? x)) (begin (sc-expand x) 
(lp (read))))))))
    clock utime stime cutime cstime gctime
     0.97  0.95  0.01   0.00   0.00   0.25
    scheme@(guile-user)> (time (compile-file "module/ice-9/match.scm"))
    clock utime stime cutime cstime gctime
     3.40  3.01  0.39   0.00   0.00   0.79
    scheme@(guile-user)> (time (load-compiled "module/ice-9/match.go"))
    clock utime stime cutime cstime gctime
     0.00  0.00  0.00   0.00   0.00   0.00


  * The 1.8 memoizer was a big win, because the % of executed code could
    be very small -- loading (ice-9 match) never had to traverse all of
    those nodes.

    OTOH, the memoizer seems to be irrelevant with psyntax run on all of
    the source code, because expansion has to traverse all nodes, so you
    don't get the savings.

  * Currently, expansion seems to take between 30% and 40% of compile
    time. (I imagine we can reduce this absolute time by a factor of 2
    or so with some more optimized compilation of multiple-values cases,
    which should be easy, and the addition of an inliner, which will be
    a bit of work.)

  * Of course, once code is compiled, loading it is *very* fast, and I
    believe it to run at about 3 or 4 times the speed, though I have not
    shown that here.

I guess the big question is, what will the impact be on our users? I
suppose we can divide those users into three categories:

  * People who write short scripts in Guile, using standard libraries.

    These people are likely to see no change, speed-wise. Guile might
    start up faster and the libraries might load faster, but then again,
    perhaps their scripts take 300 ms to expand, which makes that speed
    gain moot.

    We could offer them compilation of their scripts, in the
    compile-and-load sense, which could help if their scripts take a
    long time to run.

  * People who write big programs in Guile.

    These people will likely be irked by the initial slowness with which
    their programs run, but will also likely be receptive to compiling
    their own programs and libraries, which will make them much faster.
    It is likely that these people have butted up against Guile's speed
    before.

  * People who use programs written in Guile, but who don't know
    anything about compiling, or maybe even about Scheme.

    These people are the most likely to be adversely affected by all of
    this: their programs start up more slowly, and for no obvious reason.

I get the feeling that we really should compile libraries by default. We
can put the results in ~/.guile-something if we don't have permissions
to put them alongside the .scm files -- that allows for sharing in
multiuser installations, but doesn't require it.
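The lookup could go roughly like this sketch (Python for illustration; the
paths, naming, and staleness rule are all hypothetical, not anything Guile
actually does):

```python
import os

# Hypothetical fallback-cache lookup: prefer a .go file alongside the .scm
# source when the source directory is writable (shared, multiuser cache),
# else fall back to a per-user cache directory.

def compiled_path(scm_path, user_cache=os.path.expanduser("~/.guile-cache")):
    system_go = os.path.splitext(scm_path)[0] + ".go"
    src_dir = os.path.dirname(scm_path) or "."
    if os.access(src_dir, os.W_OK):
        return system_go  # can share the compiled file system-wide
    # per-user fallback, keyed on the absolute source path
    return os.path.join(user_cache,
                        os.path.abspath(scm_path).lstrip("/")) + ".go"

def is_fresh(scm_path, go_path):
    # use the compiled file only if it is at least as new as the source
    return (os.path.exists(go_path)
            and os.path.getmtime(go_path) >= os.path.getmtime(scm_path))
```

The point of the writability check is exactly the sharing behavior described
above: installations that can write next to the .scm files get one shared
cache, and everyone else silently degrades to a private one.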

This way we offer more advantages to our users. But I don't know. What
do y'all think?

>> Incidentally, ikarus had a similar discussion recently:
> I see; good references - there's no need for us to have the same
> conversation again!  I think I agree with Aziz's conclusion -
> i.e. "auto-caching" should be disabled by default.

I think I might have convinced myself otherwise.

> The thread suggested to me that Ikarus has to compile the code that it
> reads before it can execute it; i.e. that it doesn't retain an
> interpretation option.  Is that correct?


> If so, Guile might reasonably make slightly different decisions - e.g.
> to interpret a module when using it for the first time, and also to
> start compiling it on another thread, with the module's procedures
> being replaced one-by-one by VM programs as the compilation
> progresses.

Interesting idea :)
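The idea could be sketched like this (Python for illustration; `interpret`
and `compile_proc` are stand-ins for the real interpreter and compiler, and
the module is modeled as a plain dict):

```python
import threading

# Sketch of "interpret first, compile in the background": load a module's
# procedures interpreted, then have a worker thread replace each binding
# with its compiled version as compilation finishes.

def load_module(sources, interpret, compile_proc):
    # start with interpreted procedures so the module is usable immediately
    module = {name: interpret(src) for name, src in sources.items()}

    def compile_in_background():
        for name, src in sources.items():
            # replace bindings one-by-one as each procedure is compiled
            module[name] = compile_proc(src)

    t = threading.Thread(target=compile_in_background, daemon=True)
    t.start()
    return module, t
```

A caller would use the module right away and, after the thread finishes,
every lookup transparently yields the compiled procedure:

```python
sources = {"f": "(lambda (x) x)"}
mod, t = load_module(sources,
                     interpret=lambda s: ("interpreted", s),
                     compile_proc=lambda s: ("compiled", s))
t.join()
print(mod["f"])  # → ('compiled', '(lambda (x) x)')
```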


