guile-user

Re: Unbound variables used within a form


From: Mark H Weaver
Subject: Re: Unbound variables used within a form
Date: Sat, 13 Oct 2012 01:42:10 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.2 (gnu/linux)

Panicz Maciej Godek <address@hidden> writes:
> I just wrote the biggest function in my life :)
> It takes a scheme program and returns the list
> of all variables (symbols) that are used but
> unbound within that form.

FYI, the standard term for these is "free variables".

> I just wanted to ask:
> - beside lambda*, case-lambda, case-lambda*,
> are there any additional extensions that guile introduced
> atop the Report that should be taken into account in this code?

You also need to handle 'with-fluids'.

> - is there any simpler way to achieve the same effect, ie.
> to acquire all external symbols that are meaningful within
> a form (assuming the core semantics, that is, that all
> the macros were expanded)

First of all, it would be much simpler if you traversed the tree-il
directly, instead of decompiling the tree-il to scheme.  Take a look at
module/language/tree-il.scm and module/language/tree-il/*.scm for
details and examples of how to traverse and analyze tree-il.

In tree-il, you wouldn't have to recognize bound variables and remove
them from the list.  The macro expander already does that job, and
specifically marks each variable reference by its type:
(1) <toplevel-ref> or <toplevel-set> for top-level variables,
(2) <module-ref> or <module-set> for variables in a different module,
(3) <lexical-ref> or <lexical-set> for non-toplevel variables, and
(4) <primitive-ref> for selected primitives.  You could simply scan for
these, discarding the lexicals.
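
A rough sketch of such a scan follows.  It assumes Guile 2.2 or later,
where 'tree-il-fold' takes a "down" and an "up" procedure (older releases
used a different argument list).  The name 'global-references' is made
up for illustration; the record predicates and accessors come from
(language tree-il).

```scheme
(use-modules (language tree-il)
             (system base compile)
             (srfi srfi-1))

;; Hypothetical helper: return the non-lexical variables referenced or
;; assigned in FORM.  Module-qualified references are returned as
;; (@ MODULE NAME) or (@@ MODULE NAME) lists, everything else as symbols.
(define* (global-references form #:optional (module (current-module)))
  (define (qualified public? mod name)
    (list (if public? '@ '@@) mod name))
  ;; Called on every node on the way down the tree.
  (define (down node acc)
    (cond
     ((toplevel-ref? node) (cons (toplevel-ref-name node) acc))
     ((toplevel-set? node) (cons (toplevel-set-name node) acc))
     ((module-ref? node)
      (cons (qualified (module-ref-public? node)
                       (module-ref-mod node)
                       (module-ref-name node))
            acc))
     ((module-set? node)
      (cons (qualified (module-set-public? node)
                       (module-set-mod node)
                       (module-set-name node))
            acc))
     ((primitive-ref? node) (cons (primitive-ref-name node) acc))
     ;; <lexical-ref>, <lexical-set> and all other nodes add nothing.
     (else acc)))
  ;; Nothing to do on the way back up.
  (define (up node acc) acc)
  (delete-duplicates
   (tree-il-fold down up '()
                 (compile form #:from 'scheme #:to 'tree-il #:env module))))

(define refs (global-references '(lambda (x) (foo x y))))
```

Here 'refs' should mention 'foo' and 'y' but not the lexical 'x'.  The
order of the result is an accident of the traversal and should not be
relied upon.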

If you still prefer to work with scheme sexps, then there are some
options you can pass to the tree-il->scheme decompiler that will make
your job easier.

(define* (my-expand form #:optional (module (current-module)))
  (decompile (compile form
                      #:from 'scheme #:to 'tree-il
                      #:opts '()
                      #:env module)
             #:from 'tree-il #:to 'scheme
             #:opts '(#:use-derived-syntax? #f
                      #:avoid-lambda? #f)))

By default, the decompiler attempts to reconstruct certain common macros
such as 'cond', 'case', 'and', 'or', 'let*', and named-let, to make the
code more readable by humans.  The "#:use-derived-syntax? #f" option
turns off this behavior, which will simplify code analysis tools.
Further, "#:avoid-lambda? #f" forces every procedure to use 'lambda',
'lambda*', 'case-lambda', or 'case-lambda*', so you won't have to worry
about 'define*', and all 'define' forms will have a simple symbol as the
first operand.
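
A small experiment to see the effect, repeating the 'my-expand'
definition from above so that the snippet stands on its own.  The sample
form is arbitrary, and the exact output (in particular the names of
generated temporaries) may differ between Guile versions, so none is
shown here.

```scheme
(use-modules (system base compile))

;; Same definition as above.
(define* (my-expand form #:optional (module (current-module)))
  (decompile (compile form
                      #:from 'scheme #:to 'tree-il
                      #:opts '()
                      #:env module)
             #:from 'tree-il #:to 'scheme
             #:opts '(#:use-derived-syntax? #f
                      #:avoid-lambda? #f)))

;; 'decompile' may return a second value (the environment); keep only
;; the decompiled expression.
(define expanded
  (call-with-values
      (lambda () (my-expand '(define (f x) (or x y))))
    (lambda (sexp . rest) sexp)))

(write expanded)
(newline)
```

The result should be a 'define' whose first operand is the bare symbol
'f' and whose value is a 'lambda', with the 'or' spelled out in terms of
more primitive forms.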

In combination, these two options will allow you to eliminate the
following cases from your code:

>   (match form
>     (((or 'let 'letrec 'letrec*) (bindings ...) body ...)
>      (let-values (((shadowed used) (bound-variables bindings)))
>        (join used (diff (append-map used-variables body)
>                         (diff shadowed used)))))

If not for "#:use-derived-syntax? #f", you would have had to consider
'let*' here.

>     (('let (? symbol? name) (bindings ...) body ...)
>      (let-values (((shadowed used) (bound-variables bindings)))
>        (join used (diff (append-map used-variables body)
>                         (diff shadowed used) (list name)))))

You won't need to handle named-let.

>     (('begin body ...)
>      (append-map used-variables body))
>     (((or 'lambda 'lambda*) arg body ...)
>      (cond
>       ((or (pair? arg) (list? arg))
>        (diff (append-map used-variables body)
>              (filter-map argument-name (properize arg))))
>       ((symbol? arg)
>        (diff (append-map used-variables body) (list arg)))))
>     (((or 'define 'define*) (name ...) body ...)
>      (diff (append-map used-variables body) name))

You won't need to handle the last case above ('define' or 'define*'
with a parameter list).

>     (('define name value)
>      (diff (used-variables value) name))

This case will only happen at the top-level, and 'name' should not be
removed from the list here.

>     (((or 'case-lambda 'case-lambda*) def ...)
>      (apply join (map (match-lambda ((arg body)
>                                    (cond
>                                     ((symbol? arg)
>                                      (diff (append-map used-variables body)
>                                            (list arg)))
>                                     ((or (pair? arg) (list? arg))
>                                      (diff (append-map used-variables body)
>                                            (filter-map argument-name
>                                                        (properize arg)))))))
>                     def)))
>     (((or 'if 'or 'and) expr ...)
>      (append-map used-variables expr))

You don't need to handle 'or' or 'and'.

>     (('quote data)
>      '())
>     (('quasiquote data)
>      (letrec ((quasiquote-variables (match-lambda
>                                    (('unquote data) (used-variables data))
>                                    ((data ...)
>                                     (append-map quasiquote-variables data))
>                                    (else '()))))
>        (quasiquote-variables data)))

You don't need to handle 'quasiquote', which is just a macro.

>     (('@@ name ...)
>      '())

You need to handle both '@@' and '@', and you cannot simply ignore them.
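
As a sketch of what handling them might look like at the sexp level: the
walker below collects every (@ MODULE NAME) and (@@ MODULE NAME) form
instead of discarding it.  The name 'module-references' is hypothetical,
and the walker is deliberately naive: it only skips quoted data and knows
nothing about binding forms.

```scheme
(use-modules (ice-9 match)
             (srfi srfi-1))

;; Hypothetical helper: list the module-qualified references in FORM.
(define (module-references form)
  (match form
    ;; Quoted data contains no variable references.
    (('quote _) '())
    ;; (@ (mod name) var) or (@@ (mod name) var): record it as-is.
    (((and kind (or '@ '@@)) (module ...) (? symbol? name))
     (list (list kind module name)))
    ;; Any other proper list: look inside each element.
    ((subforms ...)
     (append-map module-references subforms))
    ;; Symbols, constants, improper lists.
    (_ '())))

(define found
  (module-references
   '(lambda (x)
      ((@ (srfi srfi-1) fold) + x (@@ (ice-9 boot-9) foo)))))
```

With this input, 'found' is
((@ (srfi srfi-1) fold) (@@ (ice-9 boot-9) foo)).  These are free
variables of the form just as much as unqualified top-level names are.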

>     ((procedure ...)
>      (append-map used-variables procedure))
>     ((? symbol? variable)
>        (list variable))
>     (else
>      '())))

Having said all this, I think you are taking very much the wrong
approach to saving the code in your GUI development system.  Any
programming system needs to work with the *source* code, which means
"the preferred form for making changes".  In Scheme, that means the
source code before *any* compilation has taken place, even macro
expansion.

If you are working with code after macro expansion has taken place, then
you've already lost many of the key advantages of Scheme.  For example,
macro expansion turns a simple record type definition into a relatively
complex set of procedure definitions, which includes internal
implementation details that may change in future versions of Guile.

For another example, take a close look at module/ice-9/psyntax.scm, and
then look at module/ice-9/psyntax-pp.scm, which is the same file after
macro expansion.  Despite my best efforts to make it somewhat readable,
it is most definitely *not* the preferred form for making changes.

I would strongly encourage you to adopt a model where your development
system maintains and manipulates its own copy of the user's source code,
rather than trying to reconstruct the code from Guile's internal data
structures.  That way lies madness.
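
A minimal sketch of the starting point for such a model: read the
user's top-level forms with 'read' and keep them unexpanded.  The name
'read-source-forms' and the sample text are invented for illustration.
Note that 'read' drops comments and layout, so a real system would
probably keep the original text as well.

```scheme
;; Hypothetical helper: read every top-level datum from PORT, without
;; expanding or evaluating anything.
(define (read-source-forms port)
  (let loop ((forms '()))
    (let ((form (read port)))
      (if (eof-object? form)
          (reverse forms)
          (loop (cons form forms))))))

(define forms
  (call-with-input-string "(define (square x) (* x x)) (square 3)"
    read-source-forms))
```

Here 'forms' holds the two data exactly as written, including the
'define' with its parameter list, which is the form the user would want
to edit.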

     Mark


