
Re: wide-char is wide

From: Trevor Daniels
Subject: Re: wide-char is wide
Date: Wed, 25 Mar 2009 23:53:29 -0000

Robin Bannister Wednesday, March 25, 2009 4:17 PM

Francisco Vila wrote:
the right googleable word is Unicode, do you agree?

Well, not fully.
When I google for > unicode arabic percent
I certainly end up at a relevant place

But I am not done.
I need to collect whatever it is \char needs,
so I go looking for hexadecimals.
There are lots of them in a nice table,
and they are not all saying the same thing.
This is where "UTF-32" could keep me straight.

Back to NR 3.3.3
The following example shows UTF-8 coded characters being used

My main point was: UTF-8 is wrong.

As this is describing the argument to \char you are
right.  \char takes the hexadecimal number representing
the Unicode code point. So the Arabic percent can be
x66a, x066a, x00066A, ...
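
To illustrate why those spellings are interchangeable, here is a minimal Python sketch (not LilyPond code). It assumes only that the digits after the "x" are read as an ordinary base-16 integer; the helper name code_point is hypothetical.

```python
# Hypothetical helper: read the hex digits that follow "x" as a plain integer.
def code_point(hex_digits: str) -> int:
    return int(hex_digits, 16)

# The three spellings quoted above, with and without leading zeros.
forms = ["66a", "066a", "00066A"]

# Collect the distinct integers they denote; leading zeros and letter case
# do not change the value, so the set ends up with a single element.
values = {code_point(f) for f in forms}

# The character assigned to that code point (the Arabic percent sign).
arabic_percent = chr(code_point("66a"))
```

All three strings parse to the same integer, which is the point being made: the code point is the number, not any particular way of writing it.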

When you criticize UTF-32 as a replacement, are you
implying that the next word "coded" is wrong too?

If we specify UTF-32 this would imply all the leading
zeros need to be expressed.  This is not required; any
valid hexadecimal representation of the integer is

If so, I agree.
The proper term is Unicode code point (mentioned at the top of 3.3.3) and it is just an integer - no need to constrain how it is represented.
(But base 16 and the codespace slicing went hand in hand.)

No problem with that.
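
As a rough illustration of why a code chart can show several different hexadecimals for one character, the following Python sketch prints the code point next to a few encoded forms of the Arabic percent sign. The variable names are made up for this example.

```python
import unicodedata

ch = "\u066a"  # the Arabic percent sign discussed above

# The code point itself: just an integer, shown here in base 16.
code_point_hex = format(ord(ch), "X")

# Encoded forms: byte sequences produced by specific encoding schemes.
utf8_hex = ch.encode("utf-8").hex()       # two bytes
utf16_hex = ch.encode("utf-16-be").hex()  # one 16-bit unit
utf32_hex = ch.encode("utf-32-be").hex()  # one 32-bit unit, zero-padded

# The official Unicode character name, for cross-checking against a chart.
name = unicodedata.name(ch)

print(name, code_point_hex, utf8_hex, utf16_hex, utf32_hex)
```

The UTF-8 bytes look nothing like the code point, and the UTF-32 form differs from it only by the fixed-width zero padding, which matches the objections to both terms raised in this thread.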

So let's say
The following example shows Unicode code points being used

OK.  I agree "UTF-8" here is wrong.

And further up, let's use this same term instead of
 "Unicode escape sequence"  and  "Unicode hexadecimal code"

Happy to replace them but I prefer to use "Unicode
hexadecimal value".

