bug#22392: 25.0.50; NS Emacs run from OS X GUI doesn't set locale

From: Alan Third
Subject: bug#22392: 25.0.50; NS Emacs run from OS X GUI doesn't set locale
Date: Mon, 18 Jan 2016 23:11:43 +0000
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (darwin)

Random832 <address@hidden> writes:

> On Mon, Jan 18, 2016, at 16:12, Alan Third wrote:
>> I don't know if it's appropriate for OS X, but I'm pretty sure it
>> matches the codings that the Windows port gives me for en_GB (ENG, in
>> Windows). Besides, surely it's better than 'nil'?
> Well, I don't have any trouble opening UTF-8 files. I'm incidentally not
> sure that it's really appropriate for windows, either - there, it should
> be using windows-1252, not iso-latin-1. I have to wonder how Emacs
> behaves on versions of windows whose default codepage is not a trivial
> superset of an ISO one. The proper encoding should be able to be
> determined by the GetACP function, and should always be a windows
> codepage.

I realised almost immediately after sending the message that this is
crap. What I was thinking was that in Windows I don't get a UTF-8
coding. You're entirely right.

>> The other possibility is that Terminal.app sets LANG to 'en_GB.UTF-8'.
>> That final part may be the difference we're seeing here?
> Yes, I think so. I was wondering also if there's some hidden setting
> that tells Terminal whether to use UTF-8 or not - I don't think it used
> it in the earliest versions of OSX.

There's a setting in Profiles -> Advanced that lets you select UTF-8.
It's UTF-8 by default on my system. If I change it I get just "en_GB".

Just to test, I changed my code to append ".UTF-8" to the end of what
it's pulling from the system (so on my machine LANG gets set to
"en_GB.UTF-8"), and here's the output from C-h C RET:

Coding system for saving this buffer:
  Not set locally, use the default.
Default coding system (for new files):
  U -- utf-8-unix (alias: mule-utf-8-unix)

Coding system for keyboard input:
  U -- utf-8-unix (alias: mule-utf-8-unix)

Coding system for terminal output:
  U -- utf-8-unix (alias: mule-utf-8-unix)

Coding system for inter-client cut and paste:
Defaults for subprocess I/O:
  decoding: U -- utf-8-unix (alias: mule-utf-8-unix)

  encoding: U -- utf-8-unix (alias: mule-utf-8-unix)

Priority order for recognizing coding systems when reading files:
  1. utf-8 (alias: mule-utf-8)
  2. iso-2022-7bit 
  3. iso-latin-1 (alias: iso-8859-1 latin-1)
  4. iso-2022-7bit-lock (alias: iso-2022-int-1)
  5. iso-2022-8bit-ss2 

So that does make a difference.
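
For illustration only, the experiment above amounts to something like the
following sketch. It is written in Python rather than taken from the actual
NS port code, and the function names are hypothetical. It assumes the system
hands back a bare locale identifier such as "en_GB" and that an already-set
LANG should be left alone.

```python
import os


def lang_with_codeset(locale_id: str, codeset: str = "UTF-8") -> str:
    """Return locale_id with a codeset appended, unless it already has one.

    Hypothetical helper: "en_GB" becomes "en_GB.UTF-8", while
    "en_GB.UTF-8" is returned unchanged.
    """
    if "." in locale_id:
        # A codeset is already present, so do not append a second one.
        return locale_id
    # Keep any modifier (for example "@euro") after the codeset.
    base, sep, modifier = locale_id.partition("@")
    return f"{base}.{codeset}{sep}{modifier}"


def set_lang_if_unset(locale_id: str, environ=os.environ) -> str:
    """Set LANG from the system locale only when LANG is missing or empty."""
    if not environ.get("LANG"):
        environ["LANG"] = lang_with_codeset(locale_id)
    return environ["LANG"]
```

Whether unconditionally choosing UTF-8 as the codeset is right is exactly
the open question; the sketch only shows the mechanics.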

The question is: what is the correct behaviour here? I, like you, would
rather get UTF-8 everywhere, but is that the *correct* behaviour for an
unconfigured system run from the GUI?

>> > This one also makes me wonder if the encoding specified in
>> > .CFUserTextEncoding/__CF_USER_TEXT_ENCODING should be used for a second
>> > choice. Which may be an encoding that may not map directly to a locale.

Possibly. I believe there's a way to extract a list of preferred
languages from OS X which are separate from the selected locale. That
may be a better way? Although, thinking about it, languages don't
necessarily map to encodings either.

> You may have a file .CFUserTextEncoding in your home directory, or an
> environment variable __CF_USER_TEXT_ENCODING, specifying a value like
> [0x1F5:]0x0:0x0 - the first one (in the environment variable only) is
> your user ID, the next is the encoding (0x0 for MacRoman) which Finder
> uses for preview of non-UTF8 text files, and the last is the language (0
> for English, maybe only US English)

Ah yes, I have exactly the same as you.
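
As a rough sketch of how that value could be read, going only by the field
layout described in the quoted text (optional user ID, then encoding, then
language) and not by any Apple documentation, something like this would do.
The function name is hypothetical.

```python
def parse_cf_user_text_encoding(value: str):
    """Split a value like "0x1F5:0x0:0x0" into (uid, encoding, language).

    Assumed layout, per the description in this thread:
      three fields -> uid:encoding:language (environment variable form)
      two fields   -> encoding:language     (uid is returned as None)
    """
    def to_int(field: str) -> int:
        field = field.strip()
        # Accept both hexadecimal ("0x1F5") and plain decimal ("0") fields.
        if field.lower().startswith("0x"):
            return int(field, 16)
        return int(field, 10)

    parts = value.strip().split(":")
    if len(parts) == 3:
        uid, encoding, language = (to_int(p) for p in parts)
        return uid, encoding, language
    if len(parts) == 2:
        encoding, language = (to_int(p) for p in parts)
        return None, encoding, language
    raise ValueError(f"unexpected field count in {value!r}")
```

For the value quoted above, an encoding of 0 would be MacRoman and a
language of 0 would be English, which still leaves the problem of mapping
either number to a locale.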

Alan Third
