Re: Unicode, ports and encoding

From: Ludovic Courtès
Subject: Re: Unicode, ports and encoding
Date: Tue, 17 Feb 2009 22:54:36 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.90 (gnu/linux)


Mike Gran <address@hidden> writes:

> 1.  To move to a Unicode-enabled guile, text information needs to be
>     converted to an internal representation when read and converted
>     back to the locale when written.  Most reading and writing for
>     ports passes through scm_getc (input) and scm_lfwrite (output).
>     Conversion between locale strings and internal strings should
>     happen there.

One strategy could be to have a new C port API, e.g., roughly based on
R6RS', with transcoders and all, and somehow arrange to have the current
port "API" mapped to that new shiny API.  It might be a bit ambitious.
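
To make the idea concrete, here is a rough sketch in C of a port that
carries a transcoder, with a getc-style entry point that simply
delegates to it.  Every name here (`sketch_port', `sketch_decoder',
`sketch_getc', the two decoders) is made up for illustration and is not
part of Guile or R6RS; the port reads from a memory buffer only to keep
the example self-contained, and the UTF-8 decoder is deliberately
minimal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types: none of these names exist in Guile.  */
struct sketch_port;

/* Input side of a transcoder: decode one character from the port's
   byte stream.  Returns the code point, or -1 at end of input or on
   malformed data.  */
typedef int32_t (*sketch_decoder) (struct sketch_port *);

struct sketch_port
{
  const unsigned char *buf;   /* bytes backing this port */
  size_t len;                 /* number of bytes in BUF */
  size_t pos;                 /* current read position */
  sketch_decoder decode;      /* the port's transcoder */
};

/* Return the next raw byte, or -1 at end of input.  */
static int
next_byte (struct sketch_port *p)
{
  return p->pos < p->len ? p->buf[p->pos++] : -1;
}

/* ISO-8859-1: each byte is its own code point.  */
static int32_t
decode_latin1 (struct sketch_port *p)
{
  return next_byte (p);
}

/* Minimal UTF-8 decoder.  It checks continuation bytes but does not
   reject overlong forms or surrogates.  */
static int32_t
decode_utf8 (struct sketch_port *p)
{
  int b = next_byte (p);
  int32_t c;
  int extra;

  if (b < 0)
    return -1;
  if (b < 0x80)
    return b;
  else if ((b & 0xE0) == 0xC0) { c = b & 0x1F; extra = 1; }
  else if ((b & 0xF0) == 0xE0) { c = b & 0x0F; extra = 2; }
  else if ((b & 0xF8) == 0xF0) { c = b & 0x07; extra = 3; }
  else
    return -1;

  while (extra-- > 0)
    {
      b = next_byte (p);
      if (b < 0 || (b & 0xC0) != 0x80)
        return -1;
      c = (c << 6) | (b & 0x3F);
    }
  return c;
}

/* What a legacy getc-style entry point could reduce to: callers keep
   asking for "the next character" and the transcoder decides how many
   bytes that takes.  */
static int32_t
sketch_getc (struct sketch_port *p)
{
  assert (p != NULL && p->decode != NULL);
  return p->decode (p);
}
```

With this layout, changing a port's encoding amounts to swapping the
`decode' pointer; existing callers of the getc-style function need not
change.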

>     This implies that a source code file should have syntax to
>     indicate its own encoding, if it is not ASCII.  Something akin to
>     the <?xml encoding="utf-8"?> line in HTML files.

One could imagine special treatment of, say, the first 10 lines of a
file: recognize Emacs file variables such as "-*- coding: utf-8 -*-"
and change the current port's transcoder accordingly.
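
A minimal sketch of such a scan, in C.  The function names, the
10-line limit as a macro, and the choice to stop the encoding name at
a space, tab or `;' are all assumptions made for this example, not
existing Guile code; it also handles only the one-line "-*- ... -*-"
form, not Emacs' "Local Variables:" block.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_LINES 10

/* Find NEEDLE within the first LEN bytes of S (S need not be
   NUL-terminated at LEN).  Returns a pointer to the match or NULL.  */
static const char *
find_in (const char *s, size_t len, const char *needle)
{
  size_t nlen = strlen (needle);
  size_t i;

  for (i = 0; i + nlen <= len; i++)
    if (memcmp (s + i, needle, nlen) == 0)
      return s + i;
  return NULL;
}

/* Scan the first SKETCH_MAX_LINES lines of TEXT for "coding: NAME"
   between two "-*-" markers on the same line.  On success, copy NAME
   into OUT (truncated to OUT_SIZE - 1 characters) and return 1;
   otherwise return 0.  */
static int
sketch_scan_coding (const char *text, char *out, size_t out_size)
{
  const char *line = text;
  int n;

  assert (text != NULL && out != NULL && out_size > 0);

  for (n = 0; n < SKETCH_MAX_LINES && *line != '\0'; n++)
    {
      const char *eol = strchr (line, '\n');
      size_t len = eol ? (size_t) (eol - line) : strlen (line);
      const char *start = find_in (line, len, "-*-");

      if (start != NULL)
        {
          const char *rest = start + 3;
          const char *end =
            find_in (rest, len - (size_t) (rest - line), "-*-");
          const char *key =
            end ? find_in (rest, (size_t) (end - rest), "coding:") : NULL;

          if (key != NULL)
            {
              const char *p = key + 7;   /* skip past "coding:" */
              size_t i = 0;

              while (p < end && (*p == ' ' || *p == '\t'))
                p++;
              while (p < end && *p != ' ' && *p != '\t' && *p != ';'
                     && i + 1 < out_size)
                out[i++] = *p++;
              out[i] = '\0';
              if (i > 0)
                return 1;
            }
        }

      if (eol == NULL)
        break;
      line = eol + 1;
    }
  return 0;
}
```

The reader would run this on the bytes at the start of the file, which
are assumed to be ASCII-compatible up to the declaration, and then
switch the port's transcoder to the encoding named in OUT before
reading any further.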

By default, the encoding used by `read' would be determined by the
input port's transcoder.

> 3.  The text encoding of a port needs to be associated with the port.
>     R6RS has the idea of transcoders for ports that require
>     conversion.  It is daunting, but, having played some ideas for a
>     few weeks, it seems that at least a subset of the transcoder
>     functionality needs to be implemented for this to make any sense.


> I sent in my copyright assignment last week, so you should have it
> now.


IIRC, the first step you suggested was the implementation of wide
string/char types.  Did you also work on this?

