[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] Per-module reader

From: Ludovic Courtès
Subject: Re: [PATCH] Per-module reader
Date: Tue, 27 Sep 2005 10:21:15 +0200
User-agent: Gnus/5.110004 (No Gnus v0.4) Emacs/21.4 (gnu/linux)

Hi Neil,

Thanks for this detailed analysis!

Neil Jerram <address@hidden> writes:

> This is nice; but concerns about whether it fits with transformer are:
> - Transformer is specified with #:use-syntax followed by module name,
>   which is a lot different from your syntax.  I prefer your syntax,
>   because use-syntax requires the transformer procedure to have the
>   same name as the module it's defined in, which seems rubbish.  But
>   on the other hand perhaps there is a reason for forcing the
>   reader/transformer to be defined in another module - are we missing
>   some subtlety here?

First of all, I've always thought of the reader as something quite
low-level, compared to macros.  And I made the assumption that the
transformer is something designed to implement macros.  In fact, that is
more or less what `boot-9.scm' says:

   - transformer: either #f or a function taking a scheme expression as
     delivered by read.  If it is a function, it will be called to perform
     syntax transformations (e. g. makro expansion) on the given scheme

What I'm trying to address here is something that cannot be addressed
using macros, nor by tweaking a module's transformer, precisely because
the transformer only sees expressions returned by the reader.

But I may be missing some subtlety, indeed.  ;-)

> - As well as the inline definition option shown here, does it also
>   work if the #:reader arg is a proc defined in one of the used
>   modules?  I guess it can't possibly work in the circular reference
>   case -- i.e. where module A uses a reader defined in module B, but B
>   also uses A and B was loaded first -- so this is probably deemed
>   user error - but how is that error presented to the user?

The `#:reader' argument must be a procedure either defined in the root
module, or in one of the used modules declared in the `define-module'
clause (it can't be a procedure defined in a subsequent `(use-modules
...)' form).

As for circular dependencies, I think there's nothing special here: the
problem is the same as for any other binding.

> - Does it make sense to have both #:reader and #:use-syntax?  Should
>   we reimplement use-syntax as a special case of #:reader?  Does it
>   make sense that the places where your reader affects libguile
>   (i.e. primitive-load) are different from the places where the module
>   transformer affects libguile (i.e. the various versions of eval)?

Yes, I think it makes sense.  In some cases, a special reader and an
appropriate transformer may both be used to solve the same problem
(e.g. implementing the Elisp character syntax).  However, the reader
approach is lower-level and may be used to implement other things not
implementable via a transformer (e.g. Elisp vectors, various keyword
syntaxes, etc.).

So I would say that it makes sense to keep both mechanisms side by side.

>> Index: ice-9/boot-9.scm
>> ===================================================================
>> RCS file: /cvsroot/guile/guile/guile-core/ice-9/boot-9.scm,v
>> retrieving revision 1.351
>> diff -u -B -b -p -r1.351 boot-9.scm
>> --- ice-9/boot-9.scm 31 Jul 2005 23:36:50 -0000      1.351
>> +++ ice-9/boot-9.scm 13 Sep 2005 12:28:08 -0000
>> @@ -1185,7 +1185,8 @@
>>    (make-record-type 'module
>>                  '(obarray uses binder eval-closure transformer name kind
>>                    duplicates-handlers duplicates-interface
>> -                  observers weak-observers observer-id)
>> +                  observers weak-observers observer-id
>> +                  reader)
>>                  %print-module))
>>  ;; make-module &opt size uses binder
>> @@ -1221,7 +1222,8 @@
>>                                        uses binder #f #f #f #f #f #f
>>                                        '()
>>                                        (make-weak-value-hash-table 31)
>> -                                      0)))
>> +                                      0
>> +                                      read)))
> Init value "read" here prevents your optimization in load.c from
> taking effect - so don't you want #f ?

Yes I do.  :-)  But it looked nicer and easier to understand to put
`read' here.

> I'm worried that this could reduce performance noticeably, and it
> seems unnecessarily inefficient.  So ... <pause> ...

The two main sources of inefficiency are (i) the use of `scm_call'
instead of a direct call to the reader subr, and (ii) the successive
calls to `scm_current_module ()'.

In the simple case, (i) can be avoided by having modules' reader default
to `#f' as you noted earlier.  As for (ii), we would greatly benefit
from having an inlined version of `scm_current_module ()'.

Anyway, given all the code that gets executed when reading from a file,
I don't think this has a noticeable impact on performance.

> Surely the low level concept here is that an alternative reader is a
> property of a port?  In that case, I would expect that
> - the switch to the alternative reader would be implemented inside
>   scm_read, using a new field on scm_t_port which can be set/read
>   using set-port-reader!/port-reader
> - the module level #:reader option would be implemented by
>   define-module calling set-port-reader! on current-load-port.  (Note
>   BTW that define-module should also do this when called to switch
>   context to an already existing module.)

Ports are lower-level abstractions and I think putting the notion of a
reader into it would look odd.  IMO, the reader option really belongs to
modules, not ports.

> Note that this also raises the question of what to do when someone
> types (define-module (module-with-reader)) at the REPL ... perhaps
> define-module should set the reader of the current input port also?

You mean: `(define-module (module-with-reader) #:reader my-read)'?
Well, that doesn't work because the REPL goes on using `read':

  guile> (define-module (hello)
           #:reader (lambda args (display "hello\n") (apply read args)))
  #<directory (hello) 30083450>
  guile> (+ 2 2)

However, this works fine:

  guile> (let ((real-read read))
  (set! read (lambda args (display "hi!\n") (apply real-read args))))
  guile> hi!
  (+ 2 2)
  guile> hi!

Conclusion: `define-module' should also set `read' to
`(module-reader (current-module))'.  This is a bug.

>> Index: libguile/modules.h
>> ===================================================================
>> RCS file: /cvsroot/guile/guile/guile-core/libguile/modules.h,v
>> retrieving revision 1.28
>> diff -u -B -b -p -r1.28 modules.h
>> --- libguile/modules.h       31 Jul 2005 23:36:14 -0000      1.28
>> +++ libguile/modules.h       13 Sep 2005 12:28:08 -0000
>> @@ -45,6 +45,7 @@ SCM_API scm_t_bits scm_module_tag;
>>  #define scm_module_index_binder             2
>>  #define scm_module_index_eval_closure       3
>>  #define scm_module_index_transformer        4
>> +#define scm_module_index_reader        12
> Why 12?  From the context I'd expect 5.

Why not?  ;-)

Because this has to be consistent with the field offsets of the
`module-type' record type defined in `boot-9.scm'.  Adding the `reader'
field as the last one (offset 12) allows to maintain binary
compatibility -- although I doubt there exists code that accesses the
fields `duplicates-handlers', `observers', etc., from C given that there
aren't even macros to do it.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]