guile-devel

Re: comments on new-model.txt [long]


From: Lynn Winebarger
Subject: Re: comments on new-model.txt [long]
Date: Sat, 28 Sep 2002 18:01:46 -0500

On Tuesday 24 September 2002 15:03, Marius Vollmer wrote:
> Lynn Winebarger <address@hidden> writes:
> 
> > So part of the miscommunication is that I've stopped thinking of
> > "top-level" (lexically open) environments necessarily being at the
> > actual top-level of a lexical environment.
> 
> What is there instead, then?  Or, where would the open environments
> be, if not at the top of lexical environments?

    Well, the clearest example I can think of is in a threaded model.
Each thread could have its own top-level, _above_ the thread-global
top-level environment.  These would essentially play the role dynamic
roots play now, but hopefully with cleaner (or at least uniform)
semantics.
     You might also think about loading Guile-compiled executables and
setting up their top-levels in a thread of a larger program/process.
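The per-thread arrangement above can be pictured as a chain of environment frames.  Here is a minimal Python sketch (all names invented, not Guile code): each thread's open top-level has the shared top-level as its parent, so lookups fall through while definitions stay thread-local.

```python
class Env:
    """A lexical environment frame with an optional parent frame."""
    def __init__(self, parent=None):
        self.bindings = {}
        self.parent = parent

    def lookup(self, name):
        # Walk the chain from the innermost frame outward.
        env = self
        while env is not None:
            if name in env.bindings:
                return env.bindings[name]
            env = env.parent
        raise NameError(name)

# The thread-global top-level, shared by all threads.
global_top = Env()
global_top.bindings["shared"] = 42

# Each thread's own top-level sits _above_ the global one.
thread_a = Env(parent=global_top)
thread_b = Env(parent=global_top)

# Thread A shadows the shared binding without mutating it.
thread_a.bindings["shared"] = "a's view"

print(thread_a.lookup("shared"))  # a's view
print(thread_b.lookup("shared"))  # 42
```

This is the sense in which the per-thread top-levels play the role of dynamic roots: state diverges per thread, but anything not rebound locally is still found in the shared frame.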

> > It complicates define a little, but not much.
> 
> Can you spell out this complication?  I think that will help me to
> understand better.

    Well, it's just a question of _where_ a define will do its
business.  Let's say I am interpreting `(define x 5) in the environment
`(,toplevel-B ,toplevel-A).  If x isn't bound in either A or B, it's
clear (I think) you want to create a binding for x in B (not A).
Likewise, if x is bound in B, then the define should just act as a set!
on that binding.  Now what happens if x is bound in A?  Should the
define just be a set! on the binding in A, or create a binding in B?
   In the executable example above, I think it's clear you want it to
create the binding in B.  On the other hand, there are clearly
situations in thread programming where you'd really want it to set! the
binding in A (although it's arguable that you'd really use set! for
that purpose and not define).
   I think if we were compiling an executable, the define would clearly
be hard-coded to create a binding in what would become B when loaded
into a thread.  In modules, of course, defines should create bindings
inside the module, but there might be a "top-level-define" form that
would have to deal with these worries.
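The define policy sketched above can be written out explicitly.  This is a hypothetical Python model (frames are plain dicts, the `prefer_inner` flag is invented) showing both answers to the "bound only in A" question:

```python
def define(chain, name, value, prefer_inner=True):
    """chain lists the open top-level frames, innermost (B) first,
    outermost (A) last; each frame is just a dict here."""
    inner = chain[0]
    if name in inner:                  # bound in B: behave like set!
        inner[name] = value
        return
    for env in chain[1:]:
        if name in env:                # bound only in an outer frame (A)
            if prefer_inner:
                inner[name] = value    # executable-style: new binding in B
            else:
                env[name] = value      # thread-style: set! the A binding
            return
    inner[name] = value                # unbound everywhere: create in B

A, B = {"x": 1}, {}
define([B, A], "x", 5)    # x bound only in A: this policy shadows in B
define([B, A], "y", 7)    # y unbound anywhere: fresh binding in B

A2, B2 = {"x": 1}, {}
define([B2, A2], "x", 5, prefer_inner=False)   # the set!-in-A alternative
```

After the first pair of calls, B holds both x and y while A is untouched; with `prefer_inner=False` the outer binding is mutated instead.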

> It would be a win if we can make that connection in a way that doesn't
> rely on module names.
   We can.  As I mentioned before I'm re-implementing syntax-case
in a meta-circular interpreter (but one that resembles Guile's internal
eval, i.e. tree-coded and eventually CPS'ed).
   In this system, all code coming into the evaluator is made into
syntax objects which store the lexical environment in which they were
created.  As the code is parsed, the syntax objects are broken down
into new syntax objects for the appropriate pieces; binding forms
extend the lexical environment(s) of their component syntax objects by
the frame they introduce.  When a syntax object consisting of only a
symbol is seen by the initial parser, it first resolves the identifier
in the lexical environment of the syntax object.  This yields a lexical
frame (if it's not bound in the lexical environment yet, this is the
first top-level frame encountered during the lookup).  If the binding
found is to a macro, that macro is applied to whatever expression is
being parsed (i.e. it could be a combination with the identifier as
keyword, _or_ a bare occurrence of the identifier), and the result is
reparsed.  Otherwise, a "variable" is created for later resolution as
either a lexical reference (de Bruijn indices) or a variable reference
(in the case of a top-level binding).  [I'm putting my current parse at
the bottom of this message.]
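A rough Python model of that symbol-dispatch step (all names invented; real frames and syntax objects are richer than this):

```python
class Frame:
    """A lexical frame; toplevel frames also catch unbound lookups."""
    def __init__(self, bindings, toplevel=False):
        self.bindings = bindings
        self.toplevel = toplevel

class SyntaxObject:
    """A datum plus the lexical environment it was created in."""
    def __init__(self, datum, lex_env):
        self.datum = datum
        self.lex_env = lex_env          # list of frames, innermost first

def resolve(stx):
    # The resolving frame: the first frame binding the symbol, or the
    # first top-level frame encountered if it is unbound.
    for frame in stx.lex_env:
        if stx.datum in frame.bindings or frame.toplevel:
            return frame
    raise NameError(stx.datum)

def parse_symbol(stx):
    frame = resolve(stx)
    binding = frame.bindings.get(stx.datum)
    if callable(binding):               # macro binding: expand, reparse
        return parse_symbol(binding(stx))
    return ("varref", stx.datum, frame) # resolved later against frame

top = Frame({}, toplevel=True)
local = Frame({"x": "lexical"})
env = [local, top]

ref = parse_symbol(SyntaxObject("x", env))       # resolves to local
unbound = parse_symbol(SyntaxObject("y", env))   # falls to top-level
top.bindings["m"] = lambda stx: SyntaxObject("x", stx.lex_env)
expanded = parse_symbol(SyntaxObject("m", env))  # identifier macro
```

Note how the identifier macro case reparses its result, so the expansion of `m` ends up as an ordinary variable reference resolved by the `local` frame.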
    Then when the full parser sees the variable reference, it looks up
the environment frame in the lexical environment it's parsing in.
Maybe I should explain that.  See, forms have to be initially parsed
into just the outer form.  This allows the proper parsing of internal
define-syntax forms, because you have to find the boundary between
definitions in a body and regular code, and you are not allowed to
parse more than once because macro definitions may cause side effects
(I posted examples of these before).  So you have to be able to "peek"
at whether the result is a begin, define, define-syntax, or "other"
prior to parsing any sub-forms.  Of course any previously defined
macros are applied during the initial parse (as described above),
keeping in mind it is an error for an internal define(-syntax) to do
anything to obscure the boundary between definitions and the rest of
the body.
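The "peek" scan can be sketched like this (a simplified Python model with invented names; the real parser works on syntax objects, not tuples, and also records new macro definitions as it goes):

```python
def scan_body(forms, macros):
    """Split a body into (definitions, expressions).  macros maps a
    keyword to a one-step expander; each form is expanded only at its
    head, just enough to classify it."""
    defs, rest = [], list(forms)
    while rest:
        form = rest[0]
        head = form[0] if isinstance(form, tuple) else None
        if head in macros:                    # expand the head once, re-peek
            rest[0] = macros[head](form)
        elif head == "begin":                 # splice begin bodies in place
            rest = list(form[1:]) + rest[1:]
        elif head in ("define", "define-syntax"):
            defs.append(rest.pop(0))          # still in the definition prefix
        else:
            break                             # boundary reached: stop peeking
    return defs, rest

# A toy macro whose expansion turns out to be a definition.
macros = {"defn": lambda f: ("define",) + f[1:]}
body = [("defn", "x", 5),
        ("begin", ("define", "y", 6)),
        ("display", "x")]
defs, exprs = scan_body(body, macros)
```

Each form crosses the expander at most once before being classified, which is the property needed when macro definitions have side effects.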
    There is a separate procedure I call fully-parse that handles
parsing the subforms.  Fully-parse carries a lexical environment
made in the conventional way (i.e. when you fully parse a lambda,
the new frame is consed onto the existing lexical environment when
you fully-parse the body of the lambda).  When it encounters a "variable"
made by parse, it looks for the resolving frame (found in the lexical
environment stored in the syntax object) in the lexical environment
of fully-parse.  If it can't be found, you have an "identifier out of context"
error that occurs, e.g. in the following:
(let ((x +))
  (let-syntax ((foo (lambda (exp)
                      (syntax-case exp ()
                        ((_ y ...)
                         (let ((x -))
                           (syntax (x y ...))))))))
    (foo 8 9 10)))
    The x will get resolved by the "syntax" keyword to be the one that
refers to "-".  That frame, however, is not in the lexical environment
in which (foo 8 9 10) is parsed (which contains 2 frames, one with x
bound to + and one with foo in a macro binding).
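That membership test is easy to model.  A Python sketch (invented names; real frames are compared with eq?, modeled here by object identity):

```python
class OutOfContext(Exception):
    pass

def resolve_varref(varref_frame, current_env):
    """A variable is legal only if its resolving frame is still present
    in the lexical environment fully-parse is walking."""
    for frame in current_env:
        if frame is varref_frame:       # identity, the analogue of eq?
            return frame
    raise OutOfContext("identifier out of context")

frame_plus = {"x": "+"}     # the frame around the macro use site
frame_minus = {"x": "-"}    # the frame captured inside the macro's template
use_site_env = [frame_plus]

ok = resolve_varref(frame_plus, use_site_env)       # found: legal reference
try:
    resolve_varref(frame_minus, use_site_env)       # not in scope here
    escaped = False
except OutOfContext:
    escaped = True
```

In the let-syntax example above, the x resolved by `syntax` carries `frame_minus`, which is why parsing (foo 8 9 10) raises the out-of-context error.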
      Anyway, macros are looked up in the lexical environment in which
they were defined.  Dybvig/Waddell's implementation (I believe) uses
opaque symbols where I'm using resolving frames.  I believe that
Waddell's implementation of modules in syntax-case does end up using an
explicit lexical frame, but I haven't analyzed their code very closely.
It's pretty clear from writing the code that you can make modules
simply by making a blessed class of those identifiers that would
otherwise be out of context.  Only I would suggest that precompiled
modules actually have a completely distinct lexical environment
embedded in their exported identifiers.  That is, the "current"
top-level environment of the compiled module when it was compiled is
clearly not the same as the top-level environment when it gets loaded,
and should really be different.  The only exceptions would be
identifiers that the module explicitly imports from the run-time
environment (via a special form).

> > (define a (load-module "foo"))
> > (define b (load-module "foo"))
> > (import a (prefix "a->"))
> > (import b (prefix "b->"))
> > (set! a->bar 5) 
> > (set! b->bar 7)
> > a->bar => 5
> > b->bar => 7
> 
> How would this work with macros?  'a->switch' might be a macro, but
> the compiler might not be able to figure this out since it can not
> know in general what 'a' refers to.
     As long as you can assume that "foo" refers to the same compiled
module at compile time (on the developer machine) as at run-time (on
the user machine), or some reasonable variant thereof (via some
versioning mechanism), then I think the compiler can know whether it's
a macro or not.  The real question is to what extent "foo" depends on
the user's run-time environment (as opposed to the developer's run-time
environment).  It looks as though Flatt has some nice forms for module
loading, but they do not strike me as being the lambda of modules (but
something that should be implementable by macros).

> >   So in this model, you could use modules you object with compiled
> > methods if you like.
> 
> I don't understand.  Come again? ;)
     As I get older, my typing gets worse and worse.  I meant you could
use modules as an object system with compiled methods.

> >    As regards syntax-case and macros/modules: I don't believe Guile
> > should use Dybvig's portable Scheme implementation.
> 
> If we get something better, great!  But this is a separate issue, I'd
> say.  (Larceny has a nice compiler front-end, I think.)
> 
> > Even if it we had a compiler, it does explicitly use symbols for
> > renaming that somehow have to be unique across compiler runs.
> 
> Hmm, the renaming is only done for local variables, not for the global
> ones.  We have a problem right now because the renamed local
> identifiers might clash with the not-renamed globale ones, tho...

   The nice thing about the explicit lexical frames is that you can store
that representation of them, and then reconstruct it when you reload, and
always use eq? to compare them.
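One way to picture that (a hypothetical sketch, not Guile code): write each frame out under a stable key, and intern on reconstruction so that two loads of the same frame yield the identical object and plain eq?-style identity comparison keeps working.

```python
_frame_table = {}   # interning table, keyed by the stored representation

class Frame:
    def __init__(self, key, bindings):
        self.key = key
        self.bindings = bindings

def intern_frame(key, bindings):
    """Reconstruct a frame from its stored key, reusing any frame
    already interned under that key."""
    if key not in _frame_table:
        _frame_table[key] = Frame(key, bindings)
    return _frame_table[key]

f1 = intern_frame("mod-foo/0", {"bar": 5})
f2 = intern_frame("mod-foo/0", {"bar": 5})   # a "reloaded" copy
print(f1 is f2)  # True: identity comparison is stable across loads
```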
    
> > (define foo-macro 
> >   (lambda (x)
> >     (syntax-case x ()
> >       ((_ y ...)
> >        (let ((z +))
> >          (syntax (z y ...)))))))
> 
> Shouldn't that be 'define-syntax' instead of 'define', or am I missing
> something?
    No, I really meant define.  syntax-case and syntax are just macros;
there's no requirement that they appear only inside macros.  One thing
they do that is special is that syntax-case binds pattern variables
(that have an attached depth) that only syntax can dereference.  For
any non-pattern variable syntax encounters, it just generates a syntax
object with that symbol as the object and captures the lexical context
in which the syntax form appears.
    Think about how this should get computed:
(let-syntax ((foo (begin
                    (pretty-print 1)
                    (lambda (x)
                      (syntax-case x ()
                        ((_ y ...)
                         (let-syntax ((dummy1 (begin
                                                (pretty-print 2)
                                                (lambda (x) x)))) ;; don't ever actually use this macro!
                           (syntax (y ...)))))))))
  (let-syntax ((bar (begin
                      (pretty-print 3)
                      (lambda (x)
                        (syntax-case x ()
                          ((_ y ...)
                           (let-syntax ((dummy2 (begin
                                                  (pretty-print 4)
                                                  (lambda (x) x)))) ;; don't ever actually use this macro!
                             (syntax (y ...)))))))))
    0))
Hint:  You should see
2
1 
4 
3
0

> Could your macro expander be made to work with the new-model?  I.e.,
> assume you need to produce macro-free Scheme plus a few extensions
> like ':module-ref'.  Could it do that?  You can assume that you can
> use real uninterned symbols for local variables (so that there wont be
> any clashes between renamed local variables and global variables).
      It is difficult, I think, because (if I'm not mistaken) the
reason the current module system is "interesting" is precisely because
variables aren't looked up until they're actually used (part of the
"lazy" macro expansion - except "lazy" usually refers to something that
has the same outcome whether you were lazy or not, aside from halting).
Once they're resolved, aren't top-level variable occurrences replaced
with references to bindings in the current module environment?  Or do
top-level variables get repeatedly looked up?
   That said, it's not impossible.  The main property (other than the
lazy lookup) is that modules are implemented by set!-ing the last frame
on the lexical stack.  You could do this, I think, and make top-level
variable references such that the evaluator looks them up on their
first evaluation (the parser just placing a "look me up later" tree
code varref where the variable reference was), and then either
hard-code them in (a la the memoizing macro expansion) or do the lookup
repeatedly.
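The "look me up later" varref can be sketched as follows (a Python model with invented names; the module environment is a dict of one-slot boxes standing in for variable objects):

```python
class Varref:
    """A tree-code node for a top-level variable reference, memoizing
    the module lookup on first evaluation."""
    def __init__(self, name):
        self.name = name
        self.cell = None            # filled in lazily

    def evaluate(self, module_env):
        if self.cell is None:       # first evaluation: do the lazy lookup
            self.cell = module_env[self.name]
        return self.cell[0]         # dereference the variable box

module = {"bar": [5]}
v = Varref("bar")
first = v.evaluate(module)    # lookup happens here
module["bar"] = [99]          # name later rebound to a new box...
second = v.evaluate(module)   # ...but the memoized cell is kept
```

`first` and `second` are both 5: the memoizing variant captures the binding cell, so later rebinding of the name in the module is invisible, whereas the repeated-lookup variant would see 99.  That trade-off is exactly the hard-code-vs-lookup choice above.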
   That's assuming my understanding of the current module system is
generally correct.  It would be good to get a better understanding of
how people use the current module system.  Do they actively make use of
the fact there is a "current" module, or
    I am still working out some bugs.  The meta-interpreter chokes on
itself at the moment (but the interpreter manages Jaffer's test.scm
using syntax-case-defined cond/case/etc - my datatype macro is just
really hairy, and thus a strenuous test of syntax-case).  A lot of the
bugs have to do with side effects, naturally.  I can post it on
guile-sources when it's reasonably bug free (i.e. when it can interpret
itself interpreting test.scm, and possibly one more level of
indirection).
     Once I get it to that stage, by the way, I can factor out all the
anonymous closures I create into named procedures, then have all
primitive operations represented as data structures.  Once I have that,
it would not be a far step to write out the fully parsed code into a
file along with lexical information in a reloadable format.  While this
isn't particularly great as compilation goes, it would be sufficient to
play around with the distinction between run-time and compile-time and
their interaction with a prototyped module system.  (Although if this
macro expansion algorithm were put into core, writing out preparsed
code would be wise to do - all the lexical analysis it does isn't as
cheap as the current method.)

Lynn
------------------------------
;;;; $<something> generally constructs a tree-code that does <something>.
;;;; with-type actually interposes the fixed fields of a data structure
;;;; into the environment, so (wrap-literal obj) at the end actually
;;;; refers to the obj field of the syntax-object lit.

(define check-macro
  (lambda (m exp)
    (define mb #f)
    (define pre-resolved (resolving-frame-of m))
    (resolve-identifier m)
    ;;; this looks for macros in the lexical environment where the operator
    ;;; was created (i.e. possibly inside a macro somewhere).
    (set! mb (lookup-id m (lexical-env-of m)))
    (let ((res (if mb
                   (if (macro-binding? (cdr mb))
                       (let ((mv (get-macro (cdr mb))))
                         (if (uninitialized? mv)
                             ;; this _should_ only occur at the boundary
                             ;; between definitions and expressions inside
                             ;; a body - throw it back undone so body-macro
                             ;; can define the macro it needs.
                             exp
                             ($apply mv ($quote exp))))
                       #f)
                   #f)))
      ;; this is a result of the evil of using side effects.  We don't
      ;; want premature resolution.
      (if res
          res
          (begin
            (if (not pre-resolved)
                (reset-resolving-frame m))
            #f)))))

;;; note parse just takes a baby step and then returns - this allows
;;; body-builder to do its magic.
(define (parse exp)
  (let ((redo #f))
    (let ((v (call/cc (lambda (k) k))))
      (if (not redo)
          (set! redo v)
          (set! exp v))
      (if (expression? exp)
          exp  ;;; when a primitive macro has gotten it
          (syntax-pattern-case exp ()
            ((rator rands ...)
             (if (id? rator)
                 (let ((r (check-macro rator exp)))
                   (if r
                       (let ((res (interpret r global-env)))
                         (redo res))
                       ($apply-apply rator rands)))
                 ($apply-apply rator rands)))
            (lit
             (if (id? lit)
                 ;;; allow for identifier macros
                 (let ((r (check-macro lit lit)))
                   (if r
                       (let ((res (interpret r global-env)))
                         (redo res))
                       ($variable lit)))
                 (with-type syntax-object lit
                   (wrap-literal obj)))))))))




