Re: What is a preferred charset?

From: Stephen Berman
Subject: Re: What is a preferred charset?
Date: Wed, 21 Nov 2018 17:48:37 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)

On Wed, 21 Nov 2018 16:24:26 +0100 Andreas Schwab <address@hidden> wrote:

> On Nov 21 2018, Stephen Berman <address@hidden> wrote:
>> The discussion in bug#33445 made me realize that I don't know what
>> distinguishes a preferred charset from other charsets.
> It's the first from (charset-priority-list) that can encode the
> character.  The priority is defined by the language environment.

On Wed, 21 Nov 2018 17:37:07 +0200 Eli Zaretskii <address@hidden> wrote:

> "Preferred" is used there in the sense of "highest priority".  See
> charset-priority-list, set-charset-priority, and char-charset.  They
> are described in the node "Character Sets" of the ELisp manual.
> I guess someone tried to say "highest-priority" in fewer characters,
> to avoid making the line too long.

I had read that section of the manual before posting, and at first I did
conclude that "preferred" meant "highest priority", but the output of
describe-char in HELLO seemed to conflict with that:

>> For example, etc/HELLO uses the non-standard text/enriched
>> annotation "x-charset" to make `describe-char' show
>> "latin-iso8859-1" as the preferred charset of INVERTED EXCLAMATION
>> MARK (#xa1), whereas when I use `C-x 8' to enter that character in a
>> buffer `describe-char' says its preferred charset is "unicode".  Why
>> are there different preferred charsets in these cases and what's the
>> significance and use of that difference in general?
> When text has the 'charset' property, we show its value as the
> highest-priority charset of the characters having that property.  This
> property is described in "Explicit Encoding".

On my system (where the value of locale-coding-system is utf-8-unix) the
first entries in charset-priority-list are: ascii iso-8859-1 unicode
latin-iso8859-1 ...  And calling char-charset on the character named
INVERTED EXCLAMATION MARK returns "unicode" here.  That accords with
what you both wrote above about highest priority, but...

> In the case of HELLO, each hello phrase was given the 'charset'
> property corresponding to its language's script, so as to instruct
> Emacs to choose the most appropriate font for that greeting.

...this seems to be a different criterion for preferred, not the highest
priority as defined above, but (maybe) the smallest charset able to
encode the character?
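
The two sources of a "preferred charset" discussed above can be compared
directly.  What follows is only a minimal sketch: `my/charset-report' is
a hypothetical helper, not part of Emacs, and it assumes that the only
inputs that matter are the `charset' text property and the result of
char-charset.  The value reported under :by-priority depends on the
language environment, so no output is shown for it.

```elisp
;; Sketch only: `my/charset-report' is a hypothetical name, not an Emacs function.
(defun my/charset-report (pos)
  "Return a plist describing the charsets of the character at POS.
:by-priority is what `char-charset' returns for the character,
:by-property is the value of the `charset' text property at POS
\(nil if the text has none), and :highest-priority is the single
highest-priority charset from `charset-priority-list'."
  (let ((ch (char-after pos)))
    (list :char ch
          ;; Charset chosen by priority among those containing CH.
          :by-priority (char-charset ch)
          ;; Explicit per-text override, as used in etc/HELLO.
          :by-property (get-text-property pos 'charset)
          ;; With a non-nil argument, only the top entry is returned.
          :highest-priority (charset-priority-list t))))

;; Usage: INVERTED EXCLAMATION MARK (#xa1) carrying an explicit
;; `charset' property, mimicking what HELLO does.
(with-temp-buffer
  (insert (propertize (string #xa1) 'charset 'latin-iso8859-1))
  (my/charset-report (point-min)))
```

When :by-property is non-nil it names the charset that the text was
explicitly tagged with, whatever the priority list says; when it is nil,
only the priority-based answer is left.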

>> and should it be documented?
> Now that you know what this is about, you tell me ;-)

I'm still not sure.

Steve Berman
