From: Paul Eggert
Subject: Re: [Emacs-diffs] master db828f6: Don't rely on defaults in decoding UTF-8 encoded Lisp files
Date: Sun, 27 Sep 2015 01:22:48 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0

Eli Zaretskii wrote:
> I've also looked at the *.po files in the latest releases of GNU Make,
> Gawk, Texinfo, and Binutils, and I find that between 20% and 25% of
> such files still use non-UTF-8 encodings.

Yes, and those files are a pain to look at with Emacs now, since it typically misguesses their encodings. Presumably Emacs should be looking at the charset= declaration in each .po file's header.
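
To make that concrete, here is a rough Python sketch (not Emacs code, and not how Emacs does or would do it) of reading the declared charset from a PO header and decoding with it. The function names, the 4096-byte search window, and the UTF-8 fallback are all assumptions made for the illustration.

```python
import re

# Matches the charset parameter of the PO header's Content-Type line, e.g.
#   "Content-Type: text/plain; charset=ISO-8859-1\n"
# The header itself is plain ASCII, so it can be searched before decoding.
_CHARSET_RE = re.compile(
    rb'Content-Type:[^"\n]*?charset=([A-Za-z0-9_.:+-]+)', re.IGNORECASE
)

def po_declared_charset(raw: bytes, default: str = "utf-8") -> str:
    """Return the charset named in the PO header, or `default` if none is found.

    Hypothetical helper: only the first 4096 bytes are searched, on the
    assumption that the header entry sits at the top of the file.
    """
    m = _CHARSET_RE.search(raw[:4096])
    if m is None:
        return default
    name = m.group(1).decode("ascii")
    # Unfilled templates carry the literal placeholder "CHARSET".
    if name.upper() == "CHARSET":
        return default
    return name

def decode_po(raw: bytes) -> str:
    """Decode PO file bytes using the declared charset instead of guessing."""
    return raw.decode(po_declared_charset(raw))
```

A declared name that Python does not recognize would raise LookupError here; a real implementation would need to decide what to do in that case.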

What's likely happening with those files is that they were originally created long ago in an 8-bit locale, and nobody has bothered to update their encodings since then. Many of the files haven't been changed in ages (about half of them have revision dates before 2010), and of course the older files tend to use legacy encodings. These older files are not particularly representative of the text that people edit today.

> while I agree with you that UTF-8 encoded files are the majority
> among non-ASCII files (and Emacs development aligns itself with that
> fact very well), the non-UTF-8 minority, even in the Posix world, is
> still significant enough, and we cannot possibly ignore it.

Naturally we cannot ignore it. All I'm suggesting is that we change the default behavior so that it's more UTF-8 friendly, since that's the way the world is going. The old Emacs behavior should still be available, for people who need it.
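
As an illustration only, the kind of default being suggested could look like the following Python sketch: try strict UTF-8 first, and fall back to a configurable legacy encoding when that fails. The function name, the `legacy` and `prefer_utf8` parameters, and the Latin-1 fallback are assumptions for the example; none of this is Emacs's actual detection logic.

```python
def decode_prefer_utf8(raw, legacy="latin-1", prefer_utf8=True):
    """Decode bytes, preferring UTF-8, and return (text, encoding_used).

    Hypothetical policy sketch:
    - prefer_utf8=True: strict UTF-8 is tried first; bytes that are not
      valid UTF-8 fall back to the `legacy` encoding.
    - prefer_utf8=False: the old behavior stays available by going
      straight to the `legacy` encoding.
    """
    if prefer_utf8:
        try:
            return raw.decode("utf-8"), "utf-8"
        except UnicodeDecodeError:
            pass  # not valid UTF-8; use the legacy encoding below
    return raw.decode(legacy), legacy
```

Strict UTF-8 decoding rejects most text in 8-bit legacy encodings that contains non-ASCII characters, which is why trying it first rarely mislabels an old file; the flag keeps the previous behavior one setting away for people who need it.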
