bug-gawk

Re: [bug-gawk] Several of gawk's po files are not UTF-8


From: Eli Zaretskii
Subject: Re: [bug-gawk] Several of gawk's po files are not UTF-8
Date: Thu, 29 Jan 2015 19:04:37 +0200

> Date: Thu, 29 Jan 2015 17:39:51 +0100
> From: Federico Leva <address@hidden>
> CC: address@hidden
> 
> Eli Zaretskii, 29/01/2015 17:17:
> > Why do you think it's a mess?
> 
> Due to the downstream bug I linked.

I'm sorry, but I didn't catch the problem in that bug report.

> >> >he.po:    GNU gettext message catalogue, ISO-8859 text
> > This one is not in UTF-8 on purpose.
> 
> Thanks. Is the reason documented somewhere?

Not that I know of.

The reason is that the file is in visual order, not in logical order,
so it needed to be encoded in an encoding that specifies visual order.
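
A minimal sketch of the visual/logical distinction, assuming a line that contains only right-to-left Hebrew letters (no digits, Latin text, or punctuation, where a plain reversal would be wrong). The helper name `to_visual` and the sample word are illustrative and do not come from the he.po file itself:

```python
# The Hebrew word "shalom" in logical order, i.e. the order in which
# the letters are read and typed: shin, lamed, vav, final mem.
logical = "\u05e9\u05dc\u05d5\u05dd"


def to_visual(line: str) -> str:
    """Naive logical-to-visual conversion for a purely right-to-left line.

    Visual order stores characters in the order they appear on screen
    from left to right, so for an all-RTL line it is the plain reverse.
    Mixed-direction text needs the full Unicode bidirectional algorithm.
    """
    return line[::-1]


visual = to_visual(logical)

# Both orders use the same ISO-8859-8 byte values; only the sequence differs.
logical_bytes = logical.encode("iso8859_8")
visual_bytes = visual.encode("iso8859_8")
```

Software that assumes logical order and applies bidirectional reordering to visual-order text will display it reversed, which is why the storage order has to be known to the reader.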

> Nowadays a lot of software refuses to work with non-Unicode
> documents

Then that's the bug, IMO.  There's nothing wrong with other encodings,
provided that they are explicitly stated.
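
As a rough illustration of honoring an explicitly stated encoding, the sketch below reads the `charset=` value from a PO catalogue's `Content-Type` header entry and decodes with it instead of assuming UTF-8. The function names, the regular expression, and the sample catalogue bytes are all assumptions made for this example. They are not taken from gawk's po files or from any gettext tool.

```python
import re

# Matches the charset in a PO header line such as
#   "Content-Type: text/plain; charset=ISO-8859-8\n"
_CHARSET_RE = re.compile(
    rb"Content-Type:[^\n]*?charset=([A-Za-z0-9_.:-]+)", re.IGNORECASE
)


def declared_charset(raw: bytes, default: str = "utf-8") -> str:
    """Return the charset named in the PO header, or `default` if absent."""
    match = _CHARSET_RE.search(raw)
    return match.group(1).decode("ascii") if match else default


def decode_po(raw: bytes) -> str:
    """Decode a PO catalogue using its own declared charset."""
    return raw.decode(declared_charset(raw))


# Hypothetical catalogue: header plus one message stored in ISO-8859-8.
sample = (
    b'msgid ""\n'
    b'msgstr ""\n'
    b'"Content-Type: text/plain; charset=ISO-8859-8\\n"\n'
    b"\n"
    b'msgid "hello"\n'
    b'msgstr "\xed\xe5\xec\xf9"\n'
)

text = decode_po(sample)
```

A tool written this way accepts any encoding the catalogue declares, as long as the Python runtime has a codec for it, rather than rejecting everything that is not Unicode.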

Moreover, some encodings are implicitly imbued with cultural
semantics, such as which fonts are preferred to display the text, and
there are some cultures that don't like Unicode for this very reason.
For example, the same Han character (Kanji, in Japanese) is expected to
be displayed differently in text encoded in SJIS and in Big-5.


