Re: Fix reader options for R6RS `get-datum'
Mon, 17 Dec 2012 20:05:09 +0100
Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux)
Mark H Weaver <address@hidden> writes:
> Andreas Rottmann <address@hidden> writes:
>> Mark H Weaver <address@hidden> writes:
>>> Section 8.3 defines 'read' as follows:
>>> Reads an external representation from textual-input-port and returns
>>> the datum it represents. The read procedure operates in the same way
>>> as get-datum, see section 8.2.9.
>>> I believe this last sentence clearly confirms my belief that 'read' and
>>> 'get-datum' should recognize the same syntax.
>> Well yes, R6RS `read' and R6RS `get-datum' need to understand the same
>> syntax, but I thought you were talking about Guile `read' and R6RS
> Ah, so you want R6RS 'read' to be different than Guile 'read'.
> I think this would be a mistake.
I think that's mandated by R6RS and the fact that Guile offers reader
options that are plainly incompatible with the syntax described in R6RS.
> I'd like to allow coherent systems to be built from a mixture of R6RS
> code, R7RS code, native Guile code, etc. With this in mind, I think it
> would be terribly confusing for users (and not particularly sensible)
> for the notation recognized by 'read' to depend upon whether the code
> that happens to call 'read' is in an R6RS library or a Guile module.
Strictly speaking, it's not whether the code is in a "Guile module" or
"R6RS library" (both are actually "Guile modules" in Guile's
implementation of R6RS), but whether the binding imported for `read' is
Guile's core `scm_read' or the one from `(rnrs io simple)'.
> For example, the code that calls 'read' when compiling source files
> happens to be in a Guile module. What does that have to do with the
> language being read? Nothing.
>> Yup, R6RS `read' needs to be implemented in terms of `get-datum', not
>> only because of reader options, but also because of the required
>> exception behavior. This is how it's done already -- see
> I thought we agreed on IRC that this is an unworkable approach to
> supporting R6RS exceptions in Guile. That path leads to a future where
> there are two variants of every primitive procedure that might throw
> exceptions. It also means duplicating every VM instruction that might
> throw exceptions.
Yeah, but until exception conversion in the `guard' (or `catch') is
implemented, `get-datum' & co. still need to adhere to the
specification, i.e. throw the exceptions mandated by R6RS in the
circumstances described therein. I don't think it is necessary to pull
the implementation strategy for exceptions into this discussion (even if
I mistakenly started with it ;-). The issue of reader options is an
orthogonal, if related, one, IMO.
> Those facts alone would be bad enough, but it gets worse. In a program
> composed of a mixture of R6RS and native Guile code, an R6RS exception
> handler should be able to properly catch an error that happened within
> native Guile code, and vice versa. That won't work with this approach
> of throwing R6RS-style exceptions from within R6RS primitives and
> Guile-style exceptions within Guile primitives.
> IMO, to create a coherent system that allows mixing of code, we need a
> single unified exception system that is sufficiently fine-grained (and
> provides enough information) to satisfy the needs of both R6RS exception
> handlers and legacy Guile exception handlers.
> At any given time, there might be exception handlers installed by both
> Guile 'catch' and R6RS 'guard'. The code that throws an exception has
> no way of knowing which kind of exception handler will catch it.
> Therefore, the conversion to native R6RS conditions needs to happen
> within the exception handler.
> Does that make sense? I thought we discussed this on IRC and agreed on
> this general approach.
Yeah, we agreed that this is where we want to arrive at, but please
let's discuss reader options only in this thread.
>>> On the flip side, if someone has enabled SRFI-105 curly-infix
>>> expressions, or any other reader extension that does not conflict with
>>> standard R6RS notation, then both 'get-datum' and 'read' should honor
>>> that setting.
>>> Does that make sense?
>> It does, and I think this is also what my patch implements, if I
>> understood both the code and your words correctly :-).
> To make this more concrete, let's consider two of the reader options
> that you'd apparently like to override within R6RS code:
> *** Case insensitivity (you would force case-sensitive mode in R6RS):
> R6RS appendix B specifies the following optional reader directives:
>   #!fold-case
>   #!no-fold-case
> and Guile 2.0.7 now supports this. Your patch would break this when
> 'read' is used within R6RS code. Furthermore, it would break in a
> strange way: #!fold-case or #!no-fold-case would take effect for the
> immediately following datum (or the containing datum if the directive is
> found within a list), but then the reader would revert to case-sensitive
> mode for subsequent datums.
OK, this makes sense. If we have per-port reader options _actually set_
by the contents of that port, these sensibly should override R6RS
syntax, even if they conflict with the "R6RS standard syntax". However,
taking over *global* reader options that contradict behavior expected by
R6RS code makes no sense. Let me make my intent clearer with an
example as well. Assume you write this piece of R6RS code:
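  (define (text->datums text)
    (call-with-port (open-string-input-port text)
      (lambda (port)
        ;; Accumulate datums in reverse, then restore source order at EOF.
        (let loop ((lst '()))
          (let ((datum (read port)))
            (if (eof-object? datum)
                (reverse lst)
                (loop (cons datum lst))))))))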
Now, when you call `text->datums' with an argument that is within the
allowed syntax for R6RS (whether optional or non-optional), I want to
ensure that the result of the invocation conforms to the R6RS syntax.
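As a rough sketch of the per-port case (assuming Guile 2.0.7 or later and
Guile's core `read'; `read-all' is a hypothetical helper, not part of any
patch in this thread), a directive that appears in the port's own contents
should apply to every datum read from that port afterwards:

```scheme
;; Hypothetical helper: read every datum from TEXT, in source order.
(define (read-all text)
  (call-with-input-string text
    (lambda (port)
      (let loop ((acc '()))
        (let ((datum (read port)))
          (if (eof-object? datum)
              (reverse acc)
              (loop (cons datum acc))))))))

;; The directive is part of the port's contents, so case folding should
;; stay in effect for all following datums on this port, not just the
;; first one.
(define folded (read-all "#!fold-case FOO (Bar BAZ)"))
```

If per-port options work as described, `folded' should contain only
lower-case symbols, independent of the global `case-insensitive' option.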
> *** Keyword style (you would disallow this option in R6RS):
> While it is true that ':' is one of the "extended alphabetic characters"
> allowed in identifiers (and therefore the standard requires that :foo be
> read as a normal symbol), this has _always_ been the case in every
> Scheme standard since at least the R2RS. Nonetheless, some users want a
> more convenient syntax for keywords, hence we have this reader option.
> It is off by default, but some users prefer to have it on. I don't see
> why this setting should be ignored if the code that calls 'read' happens
> to be in an R6RS library.
I think this is my assumption that you seem to disagree on: by using the
binding of `read' from `(rnrs io simple)', instead of the one provided
by Guile's core, the writer of the code using that binding has declared
that he wishes `read' to adhere to R6RS. Your suggestion would break
that code for any users who like to set reader options incompatible with
R6RS. The same was true with R5RS read, but with R6RS, the problem is
sharpened by the presence of libraries (and thus a way to combine code
in a modular, defined way).
Let's assume the code in question is `text->datums' as given above,
placed in some library/module, then `(symbol? (car (text->datums
":foo")))' has to hold true, no matter the global reader options.
Otherwise the `read' provided by `(rnrs io simple)' would fail to
implement R6RS. Allowing the user to override syntax on a global level
(as opposed to on a per-port one) means breaking perfectly fine code. I
don't think you can have it both ways.
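To make the breakage concrete, here is a minimal sketch using Guile's core
`read' and the global `keywords' reader option (`first-datum' is a
hypothetical helper introduced only for this illustration):

```scheme
;; Hypothetical helper: read the first datum from TEXT with core `read'.
(define (first-datum text)
  (call-with-input-string text read))

(read-set! keywords #f)              ; Guile's default setting
(define default-result (first-datum ":foo"))
(symbol? default-result)             ; #t, as R6RS requires

(read-set! keywords 'prefix)         ; a user's global preference
(define prefix-result (first-datum ":foo"))
(keyword? prefix-result)             ; #t, so (symbol? ...) no longer holds

(read-set! keywords #f)              ; restore the default
```

The same source text yields a symbol or a keyword depending only on global
state, which is exactly what code importing `read' from `(rnrs io simple)'
cannot tolerate.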
> Furthermore, I intend to add another reader directive to set the keyword
> option. If you override this option, it will break in the same manner
> as for #!fold-case as described above.
That per-port options should override the "language defaults"
(i.e. R6RS-compatible or full range of Guile's global reader options) is
a good point and completely in accordance with R6RS, IMHO: If the text
read from a port contains directives not defined in R6RS, R6RS of course
cannot say anything about the resulting objects read and their
relationship to the source text (imagine a hypothetical #!brainfuck
Andreas Rottmann -- <http://rotty.xx.vu/>