Re: [PATCH] add language/wisp to Guile?

From: Maxime Devos
Subject: Re: [PATCH] add language/wisp to Guile?
Date: Sat, 4 Feb 2023 20:09:36 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.6.0

On 04-02-2023 16:46, Dr. Arne Babenhauserheide wrote:
So I’d like to ask: can we merge Wisp as supported language into Guile?

 From some conversations elsewhere, I got the impression that

(use-modules (foo))

will search for foo.scm and not in foo.w.  I think you'll need to
tweak the loading mechanism to also look for foo.w instead of only
foo.scm, if not done already.

This needs an addition to the extensions via guile -x .w — I wrote that
in the documentation. I didn’t want to do that unconditionally, because
detecting a wisp file as scheme import would cause errors.

If done carefully, I don't think this situation would happen.
More precisely:

  * .w would be in the file extensions list.

  * Instead of a list, it would actually be a map from extensions to languages:

      .scm -> scheme
      .w -> wisp

    With this change, (use-modules (foo)) will load 'foo.scm' as Scheme
    and 'foo.w' as Wisp.  (Assuming that foo.go is out-of-date or
    doesn't exist.)

    (For backwards compatibility, I think %load-extensions needs to
    remain a list of strings, but a %extension-language variable could
    be defined.)

  * "guile --language=whatever foo" loads foo as whatever, regardless
    of the extension of 'foo' (if a specific language is requested,
    then the user knows best).

  * "guile foo" without --language will look up the extension of foo in
    the extension map. If an entry exists, it would use the
    corresponding language.  If no entry exists, it would use
    a default language (scheme).

With these changes, I don't think that Wisp code would be detected as Scheme or the other way around.
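
To make the lookup rules above concrete, here is a minimal sketch. It assumes the proposed %extension-language variable is an association list; that variable does not exist in Guile today, and the helper names file-extension and language-for-file are made up for this illustration.

```scheme
;; Hypothetical: %extension-language is only the variable proposed above,
;; it is not defined by Guile.
(define %extension-language
  '((".scm" . scheme)
    (".w" . wisp)))

;; Hypothetical helper: return the extension of FILE including the dot,
;; or #f if the last path component has no dot.
(define (file-extension file)
  (let ((dot (string-rindex file #\.))
        (slash (string-rindex file #\/)))
    (and dot
         (or (not slash) (> dot slash))
         (substring file dot))))

;; Hypothetical helper: an explicitly requested language always wins
;; (the --language case); otherwise consult the map and fall back to
;; Scheme when the extension is unknown.
(define* (language-for-file file #:optional requested)
  (or requested
      (let ((ext (file-extension file)))
        (or (and ext (assoc-ref %extension-language ext))
            'scheme))))
```

With this, (language-for-file "foo.w") picks wisp, (language-for-file "foo") falls back to scheme, and (language-for-file "foo.w" 'scheme) honours the explicit request.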

Is there a way to only extend the loading mechanism to detect .w when
language is changed to wisp?

Regardless of whether it's technically possible, that sounds insufficient to me.

Suppose someone writes a library 'Foo' in Wisp.
Suppose I write a library 'Bar' in parenthese-y Scheme, that happens to use the Foo library as a dependency.

Then when compiling Bar or running its tests, it will be done in the Scheme language, and additionally assuming that compiled .go are available for Foo, then the language will never be changed to Wisp, and hence .w will never be added to %load-extensions.

As such, the or equivalent of Foo would need to be converted to Wisp, or '-x w' would need to be added.

I don't care what language the library Foo is written in, and my library Bar isn't written in Wisp so it seems unreasonable to have to add -x w. (It wouldn't be too much trouble, but still not something that should have to be done _in Bar_, as the Wispyness of Foo is just an implementation detail of Foo, not Bar.)

Worse, adding the Wispy library Foo as a dependency of the parenthese-y library Bar would be an incompatible change, as parenthese-y dependents of Foo would need to add '-x w' in places where they didn't need to previously. It's easily resolvable, but I think it would be very annoying as well.

readable uses

This sentence appears to be incomplete; I might have misinterpreted it below (I don't know what you mean by 'readable' -- it's an adjective and you are using it as a noun?).

(set! %load-extensions (cons ".sscm" %load-extensions))

Would that be the correct way of doing this?

I assume you meant ".w" instead of ".sscm". I don't quite see how this would be an answer to:

  Is there a way to only extend the loading mechanism to detect .w when
  language is changed to wisp?

More precisely, I'm missing how it addresses 'only ... when the language is changed to wisp'.

FWIW, it appears to be an answer to the following unasked question:

  How to make Guile accept "foo.go" when "foo.w" exists and is

Also, I think that when foo.go exists, but foo.scm doesn't, then Guile
refuses to load foo.scm, though I'm less sure of that. If this is the
case, I propose removing the requirement that the source code is
available, or alternatively keep the 'source code available'
requirement and also accept 'foo.w', if not done already.

I think accepting any extension supported by any language in Guile would
be better.

This sounds like the second proposal ('alternatively ...'), but the way it is written, you appear to be proposing it as a third proposal. Is this the case?

(I mean, after this patch, Wisp is a supported language, so it seems equivalent to me.)

+; Set locale to something which supports unicode. Required to avoid using fluids.
+(catch #t

  * Why avoid fluids?

I’m not sure anymore. It has been years since I wrote that code …

I think it was because I did not understand what that would mean for the
program. And I actually still don’t know …

How would I do that instead with fluids?

  * Assuming for sake of argument that fluids are to be avoided,
    what is the point of setting the locale to something supporting

I had problems with reading unicode symbols. Things like
define (Σ . args) : apply + args
[...]
This is to ensure that Wisp files are always read as Unicode. Since it uses
regular (read) as part of parsing, it must affect (read), too.

OK.  So, Wisp files are supposed to be UTF-8, no matter the locale?
AFAICT, the SRFI-119 document does not mention this UTF-8 (or UTF-16, or ...) requirement anywhere, this seems like an omission in <> to me.

First, I would like to point out the following part of
‘(guile)The Top of a Script File’:

   • If this source code file is not ASCII or ISO-8859-1 encoded, a
     coding declaration such as ‘coding: utf-8’ should appear in a
     comment somewhere in the first five lines of the file: see *note
     Character Encoding of Source Files::.

Going by this, it is already possible to ask Guile to read the Scheme files as UTF-8; presumably the relevant bits could be copied over to Wisp. (I don't know if this applies to non-script files, but I'd assume so.)

It's not 'UTF-8 by default', but it can be 'close enough', and doing 'always UTF-8 even if coding: something-else' would be inconsistent with the Scheme language, so I ask you to consider whether it's worth it (and perhaps the answer is 'yes').

(OTOH, (guile)Character Encoding says 'In the absence of any hints, UTF-8 is assumed.' which appears to suffice for you, but it also contradicts "If this source code file is not ASCII or ISO-8859-1 encoded, ...", so I don't know what precisely is going on here.)

If you aren't going for the 'coding: ...' stuff or porting the encoding autodetection from Scheme to Wisp, here's an alternative solution:

Keep in mind that encodings are a per-port property -- the locale might have a default encoding, and ports by default take the encoding from %default-port-encoding or the locale (I think), but you can override the port encoding:

 -- Scheme Procedure: set-port-encoding! port enc
 -- C Function: scm_set_port_encoding_x (port, enc)
     Sets the character encoding that will be used to interpret I/O to
     PORT.  ENC is a string containing the name of an encoding.  Valid
     encoding names are those defined by IANA
     (, for example
     ‘"UTF-8"’ or ‘"ISO-8859-1"’.

As such, I propose calling set-port-encoding! right in the beginning of read-one-wisp-sexp.
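
A minimal sketch of that idea, independent of the actual Wisp reader: the helper read-as-utf-8 below is hypothetical and only stands in for whatever read-one-wisp-sexp does at its start. It forces the port to UTF-8 and then defers to the regular reader, without touching the locale.

```scheme
(use-modules (ice-9 binary-ports)
             (rnrs bytevectors))

;; Hypothetical helper, not part of the patch: override the encoding of
;; PORT only, then read one datum from it.
(define (read-as-utf-8 port)
  (set-port-encoding! port "UTF-8")
  (read port))

;; Usage: the two bytes CE A3 are the UTF-8 encoding of the letter Σ.
(define port
  (open-bytevector-input-port (u8-list->bytevector '(#xCE #xA3))))

;; SYM is expected to be the one-character symbol Σ, whatever the
;; locale's default encoding is.
(define sym (read-as-utf-8 port))
```

Since the encoding is a per-port property, other ports opened by the program keep their own encodings.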

Also, unrelated, I now noticed some dead code you can remove:

+(define wisp-pending-sexps (list))

[...]

+(define (wisp-replace-paren-quotation-repr code)
+         "Replace lists starting with a quotation symbol by
+         quoted lists."
+         (match code
+             (('REPR-QUOTE-e749c73d-c826-47e2-a798-c16c13cb89dd a ...)
+                (list 'quote (map wisp-replace-paren-quotation-repr a)))
+(define wisp-uuid "e749c73d-c826-47e2-a798-c16c13cb89dd")
+; define an intermediate dot replacement with UUID to avoid clashes.
+(define repr-dot ; .
+       (string->symbol (string-append "REPR-DOT-" wisp-uuid)))

There is a risk of collision -- e.g., suppose that someone translates
your implementation of Wisp into Wisp.  I imagine there might be a
risk of misinterpreting the 'REPR-QUOTE-...' in
wisp-replace-paren-quotation-repr, though I haven't tried it out.

This is actually auto-translated from wisp via wisp2lisp :-)

As such, assuming this actually works, I propose using uninterned
symbols instead, e.g.:

(define repr-dot (make-symbol "REPR-DOT")).

That looks better — does uninterned symbol mean it can’t be

Yes. This is because 'read' only reads interned symbols; uninterned symbols are unreadable:

scheme@(guile-user)> (make-symbol "foo")
$1 = #<uninterned-symbol foo 7f17efab7240>
scheme@(guile-user)> #<uninterned-symbol foo 7f17efab7240>
While reading expression:
#<unknown port>:2:3: Unknown # object: "#<"

Also: (eq? (make-symbol "stuff") 'stuff) -> #false.

Can I (match l ...) on uninterned symbols? They are used to match on
precisely these symbols later.

Yes, but it's going to look different and be more verbose:

(define uninterned-symbol1 (make-symbol "foo1"))
(define uninterned-symbol2 (make-symbol "foo2"))
(match symbol
  ((? (lambda (x)
        (eq? x uninterned-symbol1)))
   ...)
  ((? (lambda (x)
        (eq? x uninterned-symbol2)))
   ...))

-- basically, replace 'stuff by (? (lambda (x) ...)).
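
To show that this also works inside list patterns, here is a small self-contained sketch. The names repr-dot? and strip-leading-dot are made up for this example and are not taken from the patch; it only illustrates matching on an uninterned marker with (ice-9 match).

```scheme
(use-modules (ice-9 match))

;; An uninterned marker: no symbol produced by 'read' can be eq? to it.
(define repr-dot (make-symbol "REPR-DOT"))

;; Naming the predicate keeps the patterns short.
(define (repr-dot? x)
  (eq? x repr-dot))

;; Hypothetical helper: drop a leading marker from a list, and return
;; anything else unchanged.
(define (strip-leading-dot code)
  (match code
    (((? repr-dot?) rest ...) rest)
    (other other)))
```

Here (strip-leading-dot (list repr-dot 'a 'b)) yields (a b), whereas a list starting with the ordinary interned symbol REPR-DOT is left alone, which is the point of using an uninterned symbol.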

Can I write it into a string and then read it back?

No. If you could, then uninterned symbols wouldn't be uninterned anymore, but rather a separation of symbols into two kinds that pretty much behave the same, and then you would again have a (very low) risk of a collision.

When I see them, I have to turn them into a different representation
that I can then write back into the string and allow it to be read by
the normal reader.

That's the case for the old code, but AFAIK it is only done in the following ...

If this change is done, you might need to replace

+             ;; literal array as start of a line: # (a b) c -> (#(a b) c)
+             ((#\# a ...)
+               (with-input-from-string ;; hack to defer to read
+                   (string-append "#"
+                       (with-output-to-string
+                           (λ ()
+                             (write (map wisp-replace-paren-quotation-repr a)
+                                     (current-output-port)))))
+                   read))
(unverified -- I think removing this is unneeded but I don't
understand this REPR-... stuff well enough).

..., for which I proposed a replacement, so do you still need to turn it into a string & back?

The REPR supports the syntactic sugar like '(...) for (quote ...) by turning
(' ...) into '(...).

Also it is needed to turn ((. a b c)) into (a b c).

However the literal array is used to make it possible to define
procedure properties which need a literal array.

Also, I wonder if you could just do something like

   (apply vector (map wisp-replace-paren-quotation-repr a))

instead of this 'hack to defer to read' thing.  This seems simpler to
me and equivalent.
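
As a rough check of the claimed equivalence, the following sketch compares the two approaches on plain data. Both procedure names are made up for this illustration, and the wisp-replace-paren-quotation-repr step is left out, so this only compares the vector construction itself.

```scheme
;; The 'defer to read' approach: print the list, prepend "#", read it back.
(define (vector-via-read items)
  (with-input-from-string
      (string-append "#" (with-output-to-string
                           (lambda () (write items))))
    read))

;; The direct approach: build the vector from the list elements.
(define (vector-directly items)
  (apply vector items))

;; For readable data both produce equal? vectors.
(define same?
  (equal? (vector-via-read '(a "b" 3 (c d)))
          (vector-directly '(a "b" 3 (c d)))))
```

One difference worth keeping in mind: the round trip through a string only works for data that 'read' can parse again, so it would not work once the list contains uninterned symbols, while the direct construction does not care.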

That looks much cleaner. Thank you!

This sounds positive, but it is unclear to me if I have found a solution, because of your negative "However the literal array is used to make it possible to define procedure properties which need a literal array." comment.

Do I need to look into solving the 'literal array and procedure properties' stuff, or does the (apply vector (map ...)) suffice as-is?

(If there is 'literal array and procedure properties' stuff to be solved, you will need to elaborate on what you mean, because arrays aren't procedures and procedures aren't arrays -- maybe you meant 'object properties'?)


