Re: hygiene and macro-introduced toplevel bindings
Mon, 28 Feb 2011 01:15:45 +0100
Gnus/5.13 (Gnus v5.13) Emacs/24.0.50 (gnu/linux)
Andy Wingo <address@hidden> writes:
> Hello all,
> Andreas has been struggling with a nonstandard behavior of Guile's
> recently, and we should discuss it more directly.
> The issue is in expressions like this:
> (define-syntax define-accessor
>   (syntax-rules ()
>     ((_ getter setter init)
>      (begin
>        (define val init)
>        (define getter (lambda () val))
>        (define setter (lambda (x) (set! val x)))))))
> (define-accessor get-x set-x! 0)
This example serves to illustrate the issue, but I want to make clear
that there are situations where one cannot "cleanly" work around it.
In the above example, one could use `define-values' to define `setter'
and `getter', and demote `val' into a `let' form inside the
`define-values' expression. When `setter' and `getter' are macros,
however, this is not possible.
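
The `define-values' workaround mentioned above could look roughly like
this. It is a minimal sketch, assuming an implementation that provides
`define-values' (it is part of R7RS and is also found in Racket and
recent Guile releases); the names `get-x' and `set-x!' are taken from the
quoted example.

```scheme
;; Assumption: `define-values' is available in the implementation used.
;; `val' lives only in the `let', so no top-level binding named `val'
;; is created; only `get-x' and `set-x!' are defined in the module.
(define-values (get-x set-x!)
  (let ((val 0))
    (values (lambda () val)
            (lambda (x) (set! val x)))))

(set-x! 42)
(get-x) ; => 42
```

This only works because both definitions are procedures that can close
over the `let'-bound variable.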
> The issue is, what happens when this expression is expanded?
> Within a let or a lambda, it expands to three internal definitions:
> `val', `getter', and `setter', where `val' is only visible within the
> `getter' and `setter' procedures.
> At the top level, it expands to three definitions: "val", the getter,
> and the setter. However in this case the "val" binding is global to the
> module, and can be referenced by anyone.
> This is what happens in Guile. I know that some other Schemes do
> different things. Chez, as far as I understand it, binds "val" in the
> module, but under a gensym'd name. It can do this because its modules
> are syntactic: the bindings in a module are not serialized to disk as
> simple symbol-binding pairs, but rather the whole expansion-time ribcage
> is also written out. That's how I understand it anyway; I could be
> getting things wrong there.
> Anyway, in Guile our modules have always been first-class entities. We
> never intern gensym'd names in modules, because who would do that? You
> put a name in a module because you want to be able to name it, either
> internally or externally, and gensym'd names don't make any sense
> without some sort of translation table, and Guile's first-class modules
> have no such table.
Sorry, I don't understand the part about the translation table; could
you elaborate?
I agree that it makes no sense to allocate a named binding in the module
for `val', be it under a gensym'ed name, or just as `val'. The first is
bad because of the cost, as you note below, and the latter is bad
(worse, IMO) since it breaks encapsulation of the macro -- consider
(define-accessor get-foo set-foo! #f)
(define-accessor get-bar set-bar! #f)
With the current psyntax implementation in Guile, this will lead to two
definitions of `val' inside the same module. Ideally, Guile would
allocate an "anonymous binding" inside the module -- a binding that has
only a location and lacks a visible name. I have a vague idea how to
pull such a thing off:
During each macro expansion, create a lexical environment, and put all
hygienically renamed bindings (such as `val' in the above example) into
that environment. For the above example, this would already be enough;
the closures for `getter' and `setter' would refer to that environment,
and so it's kept alive as long as those don't get undefined/redefined.
If `getter' and `setter' were macros, one would have to put that
environment in a slot in the macro transformer bindings.
I know the above is quite hand-wavy, and I have no idea how difficult
such a thing would be to implement, but it might be possible, even with
first-class modules, to avoid the costs of gensym'd top-level bindings
without breaking hygiene/encapsulation. It might even be advantageous
in terms of speed, as I guess lexical environment access is faster than
referring to top-level bindings by name(?).
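
As a rough illustration of the intended effect (a hand-written
approximation, not what psyntax or any other expander actually emits),
two uses of `define-accessor' could behave as if they had been written
like this. Each `let' stands in for the per-expansion environment, so
each `val' has a location but no name in the module.

```scheme
;; Hypothetical hand expansion of (define-accessor get-foo set-foo! #f).
(define get-foo #f)
(define set-foo! #f)
(let ((val #f))                       ; anonymous location for this expansion
  (set! get-foo (lambda () val))
  (set! set-foo! (lambda (x) (set! val x))))

;; Hypothetical hand expansion of (define-accessor get-bar set-bar! #f).
(define get-bar #f)
(define set-bar! #f)
(let ((val #f))                       ; a separate location, same source name
  (set! get-bar (lambda () val))
  (set! set-bar! (lambda (x) (set! val x))))

(set-foo! 'changed)
(get-foo) ; => changed
(get-bar) ; => #f, the two accessors do not share state
```

The closures keep each environment alive, and no `val' binding is ever
visible in the module.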
> Furthermore, gensyms at the top-level have a cost that they do not have
> lexically. When you introduce a lexical binding in a macro and cause a
> new name to be allocated to it, that binding only exists within the
> scope of that form -- if the form is an expression, it exists during the
> dynamic extent of that expression, and if it is a definition, its extent
> is bound to the extent of the binding of some /other/ name---the
> top-level name.
> But when you introduce a generated name to the top-level, you create
> some trash "val-12345543" binding which will always be there, and you
> don't know why. It can never be removed by normal means, because
> top-level bindings are never removed, and its name is invisible to all
> other code -- it has infinite extent.
> And that's the thing that really bothers me about generated top-level
> names: they are only acceptable if you plan on never changing your
> mind. You're not bothered by the trash name, because you'll never
> expand that expression again.
> So! That's my rant. But this is, even more than usual, a case in which
> I could simply be wrong; so if you really want to defend generated
> top-level names, now would be a great time to do so ;-)
Well, I'm not defending generated top-level names but arguing for
preserving encapsulation/hygiene for macros ;-). I wonder what you (and
others) think about my idea as outlined above -- could such a thing
work?
Andreas Rottmann -- <http://rotty.yi.org/>