Re: [PATCH] add language/wisp to Guile?

From: Philip McGrath
Subject: Re: [PATCH] add language/wisp to Guile?
Date: Mon, 27 Feb 2023 23:27:39 -0500


On Monday, February 27, 2023 2:26:47 AM EST Marc Nieper-Wißkirchen wrote:
> Am Mo., 27. Feb. 2023 um 00:22 Uhr schrieb Philip McGrath
> <>:
> > Hi,
> > 
> > On Sunday, February 26, 2023 6:02:04 AM EST Marc Nieper-Wißkirchen wrote:
> > > Am So., 26. Feb. 2023 um 08:46 Uhr schrieb <guile-devel->:
> > > > Message: 1
> > > > Date: Sun, 26 Feb 2023 02:45:12 -0500
> > > > From: "Philip McGrath" <>
> > > > To: "Maxime Devos" <>, Ludovic Courtès
> > > > 
> > > >         <>, "Matt Wette" <>,
> > > >
> > > > 
> > > > Cc: "Christine Lemmer-Webber" <>
> > > > Subject: Re: [PATCH] add language/wisp to Guile?
> > > > Message-ID: <>
> > > > Content-Type: text/plain;charset=utf-8
> > > 
> > > [...]
> > > 
> > > I would like to make two remarks, which I think are essential to get
> > > the semantics right.
> > > 
> > > The R6RS comments of the form "#!r6rs" are defined to modify the
> > > lexical syntax of the reader; possibly, they don't change the language
> > > semantics (after reading).  In particular, "#!r6rs" also applies to
> > > data files but does not affect the interpretation of the data after it
> > > is read. It cannot because the reader otherwise ignores and does not
> > > report comments.
> > > 
> > > Thus a comment of the form "#!r6rs" may be suitable for Wisp, but it
> > > is not a substitute for Racket's "#lang" (or a similar mechanism).
> > > Guile shouldn't confuse these two different levels of meaning.
> > 
> > I agree that it's important to distinguish between lexical syntax (`read`)
> > and the semantics of what is read.
> > 
> > However, Racket's `#lang` in fact operates entirely at the level of
> > `read`.
> > (Racketeers contribute to confusion on this point by using `#lang` as a
> > shorthand for Racket's entire language-creation infrastructure, when in
> > fact `#lang` specifically has a fairly small, though important, role.)
> > When `read` encounters `#lang something`, it looks up a reader extension
> > procedure in the module indicated by `something` and uses that procedure
> > to continue parsing the input stream into data. Importantly, while syntax
> > objects may be used to attach source location information, there is no
> > "lexical context" or binding information at this stage, as one familiar
> > with syntax objects from macro writing might expect: those semantics come
> > after `read` has finished parsing the input stream from bytes to values.
> [...]
> Thank you for the reminder on Racket's #lang mechanism; it is a long
> time ago since I wrote some #lang extensions myself when experimenting
> with Racket.
> Nevertheless, I am not sure whether it is relevant to the point I
> tried to make.  The "#!r6rs" does not indicate a particular language
> (so tools scanning for "#!r6rs" cannot assume that the file is indeed
> an R6RS program/library). 

I think I had missed that some of your remarks are specifically about the
"#!r6rs" directive, not directives of the form "#!<identifier>" more generally. 
I agree that implementations have more responsibilities with respect to
"#!r6rs", that the presence of "#!r6rs" in a file is not enough to conclude 
that the file is an R6RS program/library, and that a straightforward 
implementation of "#!r6rs" as reading like "#lang r6rs" in the manner of my 
previous examples would not conform to R6RS.

Also, on the broader question, my first preference would be for Guile to 
implement `#lang language/wisp`, not least to avoid the confusing subtleties 
here and the potential for humans to confuse `#!language/wisp` with a shebang 
line. I raise the possibility of `#!language/wisp` only as an alternative if 
people are more comfortable using a mechanism that R6RS specifically designed 
for implementation-defined extensions.

Nonetheless, I'll try to explain why I think "#!r6rs" can be handled, and is 
handled by Racket, consistently with both "#lang r6rs" and the behavior 
specified in the report.

> Of course, R6RS gives implementations the freedom to modify the reader
> in whatever way after, say, "#!foo-baz" was read.  Thus, "#!foo-baz"
> could be defined to work like Racket's "#lang foo-baz," reading the
> rest of the source as "(module ...)".  But as long as we stay within
> the confines of R6RS, this will only raise an undefined exception
> because, in general, "module" is not globally bound.

Before getting to the general point, specifically about "module" not being 
bound: in Racket, a root-level `module` form is handled quite similarly to the 
`library` form in R6RS, which says in 7.1 [1]:

>>>> The names `library`, `export`, `import`, [...] appearing in the library 
syntax are part of the syntax and are not reserved, i.e., the same names can 
be used for other purposes within the library or even exported from or 
imported into a library with different meanings, without affecting their use in 
the `library` form. 

None of the libraries defined in R6RS export a binding for `library`: instead, 
the implementation must recognize it somehow, whether by handling it as a 
built-in or binding it in some environment not standardized by R6RS.

(The `racket/base` library/language does in fact export a binding for `module` 
which can be used to create submodules with the same syntax as a root-level 
`module`, but that isn't relevant to the handling of a root-level `module` 
form itself.)

> I don't want to contradict you; I just mean that a plain "#!r6rs"
> without a top-level language where "module" is bound is not equivalent
> to "#lang" and that trying to switch to, say,  Elisp mode with
> "#!elisp" would leave the boundaries of the Scheme reports (and when
> this is done, this specific discussion is moot).
> [...]
> (It must be compatible with calling the procedures "read" and "eval"
> directly, so "#!r6rs" must not wrap everything in some module form,
> say.)

Now I'll try to sketch Racket's handling of "#!r6rs" from an R6RS perspective. 
For the sake of a concrete example, let's consider this program:

#!r6rs
(library (demo)
         (export x)
         (import (rnrs base))
  (define x
    (+ 1 #!r6rs 2)))

Using R6RS's `read`/`get-datum` and `write` on such input produces the datum 
(with linebreaks for legibility):

(library (demo)
         (export x)
         (import (rnrs base))
  (define x
    (+ 1 2)))
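
A minimal sketch of that round trip, assuming an R6RS implementation that provides the `(rnrs)` composite library. The names `demo-source` and `demo-datum` are hypothetical and exist only for this illustration:

```scheme
#!r6rs
(import (rnrs))

;; Hypothetical name: the example's library form, held in a string.
(define demo-source
  "(library (demo) (export x) (import (rnrs base)) (define x (+ 1 #!r6rs 2)))")

;; `get-datum` parses a single datum from a textual input port.
;; Per the report, the embedded "#!r6rs" is skipped like a comment.
(define demo-datum
  (get-datum (open-string-input-port demo-source)))

(write demo-datum)
(newline)
```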

Racket is an implementation of the sort contemplated by Appendix A [2]:

>>>> [T]he default mode offered by a Scheme implementation may be non-
conformant, and such a Scheme implementation may require special settings or 
declarations to enter the report-conformant mode.

When Racket begins reading a module's source code, the reader is in a non-
conformant mode. The first "#!r6rs" lexeme is the required "declaration[] to 
enter the report-conformant mode". From that point on, the input is read with 
a reader as specified in R6RS, with no extensions. Thus, the second "#!r6rs" 
lexeme, as the report specifies, is treated as a comment. (Since the reader is 
already in strict R6RS mode, it has no side-effect.)

Racket's reader (as noted, in `with-module-reading-parameterization` mode) 
produces the following datum:

(module anonymous-module r6rs
   (library (demo)
            (export x)
            (import (rnrs base))
     (define x
       (+ 1 2))))

Racket's reader has adjusted the "declaration[] to enter the report-conformant 
mode" to an explicitly-parenthesized form, but the portion of the input read 
in report-conformant mode produced the same datum as above.

The important point here is that the `read` and `eval` procedures from 
`racket/base` are not the same as the `read` and `eval` from `(rnrs io 
simple)` and `(rnrs eval)`, respectively. The R6RS version of `read` does not 
introduce a `module` form, and the R6RS version of `eval` happily evaluates 
the forms that the R6RS `read` produces.
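
As a small illustration of the R6RS side of that distinction, here is a sketch assuming an implementation that provides `(rnrs)` and `(rnrs eval)`. It reads an expression rather than a `library` form, since `eval` from `(rnrs eval)` takes an expression and an environment. The names `expr` and `result` are hypothetical:

```scheme
#!r6rs
(import (rnrs) (rnrs eval))

;; Read one datum with the R6RS reader; no `module` wrapper is introduced,
;; and the embedded "#!r6rs" is skipped like a comment.
(define expr
  (get-datum (open-string-input-port "(+ 1 #!r6rs 2)")))

;; Evaluate the datum as an expression in an environment built from
;; (rnrs base), which supplies the binding for `+`.
(define result
  (eval expr (environment '(rnrs base))))

(write result) ; on a conforming implementation, the datum is (+ 1 2), so 3
(newline)
```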

It's a bit of a tangent here, but for the broader discussion about `#lang` or 
similar it might be interesting to note that, in addition to this 
"declaration[] to enter the report-conformant mode" that can be written in-
band in a report-conformant source file, Racket also has out-of-band ways to 
enter R6RS "report-conformant mode". In particular, Racket distributes an 
executable `plt-r6rs` that can run and compile R6RS programs that do not 
necessarily start with `#!r6rs`. [3] Invoking it instead with the form
`plt-r6rs --install ‹libraries-file›` will read `‹libraries-file›`, which need 
not begin with `#!r6rs`, with the R6RS-conformant reader. The ‹libraries-file› 
should contain R6RS library forms, each of which will be installed to its own 
file, located where Racket would expect to load the R6RS library with its 
declared name. In the process of installing the libraries, `plt-r6rs` adds a 
`#!r6rs` directive at the beginning of each file. [4]

> In an implementation that supports, say,
> R6RS and R7RS, "#!r6rs" can only switch the lexical syntax but cannot
> introduce forms that make the implementation change the semantics from
> R7RS to R6RS, e.g., in the case of unquoted vector literals.

I'm not very familiar with R7RS, and, quickly skimming R7RS Small, I didn't 
see a notion of directives other than `#!fold-case` and `#!no-fold-case`. 
(That's a bit ironic, given that the R6RS editors seem to have contemplated
`#!r7rs` *before* they considered `#!r6rs`.) I think a similar technique could 
work in this case, though. From an R6RS perspective, at least, an 
implementation could implement a directive such that the `read` from `(rnrs)` 
would parse:

>>>> #!r7rs #(1 2 3)

as:

>>>> (quote #(1 2 3))
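
One way to picture such a directive is as a wrapper around the R6RS reader. The sketch below is only a toy: `read-datum/r7rs-vectors` is a hypothetical name, it handles only the case where the whole datum is a vector, and it does not model how the directive itself would switch the reader's mode:

```scheme
#!r6rs
(import (rnrs) (rnrs eval))

;; Hypothetical helper: read one datum and, if it is a vector,
;; wrap it in `quote` so that R6RS `eval` accepts it.
(define (read-datum/r7rs-vectors port)
  (let ((datum (get-datum port)))
    (if (vector? datum)
        (list 'quote datum)
        datum)))

(define wrapped
  (read-datum/r7rs-vectors (open-string-input-port "#(1 2 3)")))

;; Evaluating the wrapped datum yields the vector itself.
(define value
  (eval wrapped (environment '(rnrs base))))
```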

The other direction is a bit trickier, but the R7RS specification for `read` 
from `(scheme base)` does say that "implementations may support extended 
syntax to represent record types or other types that do not have datum 
representations." It seems an implementation could define a type "non-self-
evaluating-vector" and have `read` from `(scheme base)` produce a value of 
that type when given:

>>>> #!r6rs #(1 2 3)

Presumably `eval` from `(scheme eval)` would raise an error if asked to 
evaluate such a datum, as it does if asked to evaluate an unquoted (), but 
`quote` from `(scheme base)` would arrange to replace such a datum with an 
ordinary vector.

(I'm not at all sure that an implementation *should* do such a thing: I'm only 
trying to explain why I don't think the Scheme reports prohibit it.)


