Re: [Gnu-arch-users] ANNOUNCE: Spaces in filenames, finished

From: Jan Hudec
Subject: Re: [Gnu-arch-users] ANNOUNCE: Spaces in filenames, finished
Date: Thu, 18 Mar 2004 20:52:37 +0100
User-agent: Mutt/

On Thu, Mar 18, 2004 at 11:36:39 -0800, Tom Lord wrote:
>     > From: Jan Hudec <address@hidden>
>     > Actually, there are two escaping schemes already widely used:
>     >     1) XML Entities: &#<unicode-codepoint-number>;
>     >     2) Old C's scheme extended for unicode:
>     >        \u<unicode-codepoint-number-in hexa>
>     > Why does hackerlab/pika/tla invent another one?
>     > (The argument that R5RS only requires \" is clear to me, but does not
>     > look sufficient to me).
> Regarding C's: how many hex digits go there?  It seems to me that the
> only sane ways to escape unicode codepoints in a string all involve a
> delimited hex number.  C's syntax is also no good for having
> multi-character symbolic names for characters (for the same reason --
> delimiter issues).

I realized that while writing my next mail. Tcl's syntax uses 4 hex
digits, which won't do. Java uses \u with 4 and is not correct either!
Python uses \u with 4 and \U with 8. I'm not entirely sure about Java,
though. Perl's syntax, \x{<hex-number>}, is about as ugly as anything
else (but is correct).
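
To illustrate why a delimited form is unambiguous, here is a minimal
sketch in Python of expanding Perl-style \x{<hex-number>} escapes. The
helper name expand_delimited is hypothetical and is not taken from
hackerlab, pika or tla; it only shows that braces let the hex number
have any length.

```python
import re

# Hypothetical helper, for illustration only: expands Perl-style
# \x{<hex-number>} escapes. The braces delimit the number, so it may
# have any number of hex digits without ambiguity.
_DELIMITED = re.compile(r"\\x\{([0-9A-Fa-f]+)\}")

def expand_delimited(text):
    """Replace each \\x{HEX} escape with the code point it names."""
    return _DELIMITED.sub(lambda m: chr(int(m.group(1), 16)), text)

# Two digits or five digits, the closing brace says where the number ends.
assert expand_delimited(r"a\x{20}b") == "a b"
assert expand_delimited(r"\x{1D11E}") == "\U0001D11E"
# A digit right after the escape is not swallowed into the number.
assert expand_delimited(r"\x{41}1") == "A1"
```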

It seems several programming languages agree on \u<4-hex-digits>, but
that's not sufficient. Some languages do not seem to have noticed that
Unicode has more than 2^16 codepoints at all, and those that have use
different syntaxes.
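
As a small illustration of why 4 hex digits are not sufficient, the
following sketch uses Python string literals (assuming a current
Python 3 interpreter). U+1D11E is picked only as an example of a
codepoint above U+FFFF.

```python
# U+1D11E (MUSICAL SYMBOL G CLEF) lies above U+FFFF, so it needs more
# than 4 hex digits.
clef = "\U0001D11E"   # \U takes exactly 8 hex digits
assert len(clef) == 1
assert ord(clef) == 0x1D11E

# \u takes exactly 4 hex digits, so a fifth digit is read as an ordinary
# character: this is U+1D11 followed by "E", not U+1D11E.
wrong = "\u1D11E"
assert len(wrong) == 2
assert wrong == "\u1D11" + "E"
```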

> Regarding XML: it's simply a judgement call on my part about where the
> greatest potential for synergies arise.  XML sure gets a lot of
> exercise, but the toolsets that have arisen around it don't really
> impress me as giving a lot of leverage in this situation (and, not
> without reason).  The Scheme-derived syntax, which I'm working with
> some success on turning into an Official Standard, I think will turn
> out to work better with other tools in the medium to longer term.
> It's just a judgement call....I can't prove it.  Yes, it's an
> "underdog" bet but, that's where I laid my money down.

Programmers are used to \ and will probably tend to stick to it.
Hopefully one day there will be agreement on some sane quoting -- or
everything will be in UTF-8, so it won't be that much of an issue.

                                                 Jan 'Bulb' Hudec 
