bug#33044: Guile misbehaves in the "ja_JP.sjis" locale

From: John Cowan
Subject: bug#33044: Guile misbehaves in the "ja_JP.sjis" locale
Date: Tue, 16 Oct 2018 08:52:59 -0400

> At this point, I'm inclined to believe that Shift_JIS is not suitable as
> a locale encoding on POSIX systems, and that we should not try to
> support it in Guile.
>
> What do you think?

> Can you tell me how backslash and tilde are represented in Shift JIS?

They aren't: iconv is right.  Japanese Windows users are used to seeing Windows pathnames that look like "C:¥foo¥bar" and, when writing C, strings like "first line¥nsecond line".  So the character at #\x5C is *functionally* a backslash that is *displayed* as a yen sign.  This is reinforced by the fact that Shift_JIS #\x5C round-trips with U+005C REVERSE SOLIDUS (backslash), whereas U+00A5 YEN SIGN is mapped only from Unicode (or other encodings) to Shift_JIS, never the other way around.
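As a rough illustration of that one-way mapping, here is a minimal sketch using Python's built-in "shift_jis" codec.  This is an assumption-laden stand-in: it is not Guile or iconv, whose tables may differ in detail, and the helper name encode_or_none is made up for the example.  Some converters reject U+00A5 outright, so the sketch reports None in that case instead of failing.

```python
# Sketch only: exercises Python's built-in "shift_jis" codec,
# not Guile or iconv, whose conversion tables may differ.
SJIS = "shift_jis"


def encode_or_none(text):
    """Return the Shift_JIS bytes for text, or None if the codec rejects it.

    Hypothetical helper for this example only.
    """
    try:
        return text.encode(SJIS)
    except UnicodeEncodeError:
        return None


# Byte 0x5C decoded from Shift_JIS.
decoded = b"\x5c".decode(SJIS)

# U+005C (backslash) and U+00A5 (yen sign) encoded to Shift_JIS.
backslash_bytes = encode_or_none("\\")
yen_bytes = encode_or_none("\u00a5")

print("0x5C decodes to U+%04X" % ord(decoded))
print("U+005C encodes to", backslash_bytes)
print("U+00A5 encodes to", yen_bytes)
```

If the codec follows the mapping described above, both characters encode to the single byte 0x5C, but decoding that byte only ever yields U+005C.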

This is the last survivor of the "national characters" concept of ISO 646, whereby certain 7-bit code positions were interpreted differently in different countries.  For Danish and Norwegian programmers, for example, blocks in C began with æ and ended with å rather than { and } respectively, and the OR operator | was ø.  In the same way, British and Irish programmers used £ instead of # at the beginning of comments in awk and shell programs.  With the arrival of Latin-{1,2,3,4} this concept was eventually abandoned, and all systems converged on ISO-646-IRV (the same as US-ASCII) *except* Japanese systems.

So I recommend that you do what everyone else does and ignore the issue in JIS-based encodings, of which Shift_JIS is the only one in practical use (and it _is_ heavily used in Japan, where it is almost the only encoding for documents on desktops).  Dropping support for the encoding altogether is not an option in Japan: see the comments by Joel Rees, Norman Diamond, and Ryan Thompson at the bug you pointed to.

John Cowan          http://vrici.lojban.org/~cowan        address@hidden
In might the Feanorians / that swore the unforgotten oath
brought war into Arvernien / with burning and with broken troth.
and Elwing from her fastness dim / then cast her in the waters wide,
but like a mew was swiftly borne, / uplifted o'er the roaring tide.
        --the Earendillinwe
