Fri, 11 May 2001 14:38:04 -0400
> From: Neil Jerram <address@hidden>
> >>>>> "Jim" == Jim Blandy <address@hidden> writes:
> Jim> Unicode is (or was) controversial in Japan. For the benefit
> Jim> of other readers, I'll summarize my understanding of the
> Jim> conflict. [...]
> Thanks - very interesting!
> Jim> So, essentially, this means that all Japanese programmers are
> Jim> accustomed to having text indicate not only the characters,
> Jim> but also the *language* those characters represent. In
> Jim> particular, they feel it is important that the encoding
> Jim> distinguish between Chinese text and Japanese text. Now,
> Jim> they all agree that Chinese and Japanese use the same
> Jim> characters. [...]
> So if one comes across a piece of paper with Chinese characters
> written on it, how does one know whether to read it as Chinese text or
> as Japanese text?
For very short pieces it may not matter. Even though the characters
are pronounced differently in Chinese and Japanese (and even in different
dialects of Chinese), they mostly have the same meaning in all of them.
For longer pieces, Japanese rarely goes for more than half a line of
Hanzi (Chinese characters) without inserting a few characters from its
other two scripts. Katakana and Hiragana generally have fewer strokes
and are more curved. If you see anything that looks like a fat "6" flopped
forward on its face, it's Japanese. There are no circles in Chinese.
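The same rule of thumb can be sketched in code. This is only a rough
heuristic, not a real language detector: it assumes the text is already
decoded to Unicode, and it assumes that any Hiragana letter (U+3041 to
U+3096) or Katakana letter (U+30A1 to U+30FA) marks the text as Japanese.
The function name looks_japanese is made up for this illustration.

```python
def looks_japanese(text: str) -> bool:
    """Heuristic: True if the text contains at least one kana letter.

    Assumption: Hiragana letters are U+3041..U+3096 and Katakana letters
    are U+30A1..U+30FA. A run of Han characters with no kana is treated
    as "not known to be Japanese", which is wrong for very short pieces.
    """
    for ch in text:
        cp = ord(ch)
        if 0x3041 <= cp <= 0x3096 or 0x30A1 <= cp <= 0x30FA:
            return True
    return False


if __name__ == "__main__":
    print(looks_japanese("日本語のテキスト"))  # contains kana
    print(looks_japanese("中文文本"))          # Han characters only
```

A short piece written only in Han characters returns False either way,
which matches the point that for very short pieces the question may
have no answer.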
> As you also say, this is not specific to Japanese/Chinese; it happens
> for almost all language combinations, and is usually solved (in the
> brain) by assessing the context in which the characters appear.
> So, is it that the language information crept in by accident and then
> programmers found that it had particular uses?
That's how it seems to me. I like Unicode. It extends ASCII
to represent any text, without trying to be a universal
natural-language-understanding AI, or degenerating into a
PostScript-like ink shape coder. Of course there are some
compromises. For an example closer to home, it waffles on
the meaning of '\n'. ASCII compatibility is more trouble
than the entire Eastern hemisphere.
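To illustrate the '\n' point: Unicode text can carry several different
line terminators besides the ASCII line feed. The sketch below uses
Python's str.splitlines(), which treats LF, CR LF, NEL (U+0085), LINE
SEPARATOR (U+2028) and PARAGRAPH SEPARATOR (U+2029), among others, as
line boundaries, and compares it with a plain split on '\n'. The sample
string is invented for the example.

```python
# One string, five different line terminators:
# LF, CR LF, LINE SEPARATOR, PARAGRAPH SEPARATOR, NEL.
sample = "a\nb\r\nc\u2028d\u2029e\x85f"

# splitlines() recognizes all of them as line boundaries.
unicode_lines = sample.splitlines()

# Splitting on '\n' alone sees only the two line feeds and
# leaves the stray '\r' attached to the second piece.
ascii_lines = sample.split("\n")

if __name__ == "__main__":
    print(len(unicode_lines))  # 6
    print(len(ascii_lines))    # 3
```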
I don't like to argue with the Japanese about their own language, but
I think if Unicode had been a few years earlier, the alternatives
would never have happened.
-- Keith Wright <address@hidden>
Programmer in Chief, Free Computer Shop <http://www.free-comp-shop.com>
--- Food, Shelter, Source code. ---