Re: What is a preferred charset?

From: Eli Zaretskii
Subject: Re: What is a preferred charset?
Date: Thu, 22 Nov 2018 17:30:29 +0200

> From: Stephen Berman <address@hidden>
> Cc: address@hidden,  address@hidden
> Date: Thu, 22 Nov 2018 10:07:00 +0100
>
> > It is not a question of success or failure: every charset which
> > supports the character "succeeds".  We choose one of them in order to
> > produce the effect (such as select a font for displaying it) that
> > suits best what this particular user in this particular case expects.
> > When text comes from an encoding that specifies its charset (such as
> > Latin-N), we can determine that charset from the encoding; if not, we
> > use the charset-priority order that is determined by the locale, as
> > fallback.
> So "preferred charset" means "charset the encoding specifies, if any,
> otherwise the locale-specific highest priority charset"?

Yes, but that's not a useful definition; see below.

> If so, it's still not clear to me why HELLO specifies charsets that
> (at least in some cases, like INVERTED EXCLAMATION MARK) differ from
> the highest priority

Because it wants to demonstrate that Emacs is capable of using mixed
character sets in the same buffer, and still have each one displayed
as it would be in its native locale.

> is it because the specified charsets are known to correctly
> display the characters regardless of locale (if that's even possible),
> while it's not known whether the highest priority charset can correctly
> display them?

No, the highest priority charset will also succeed in displaying
them.  But HELLO wants each greeting to be a good representative of
its native locale, regardless of the locale in which the Emacs session
showing HELLO runs.

I find the following description useful when thinking about this:
Emacs wants to know the charset of each character to be able to
display it correctly using the proper fonts (and also for a few other
features).  If the text announces its charset via the 'charset' text
property, Emacs uses that; otherwise it guesses using the locale's
defaults as guidelines.  It is similar to what Emacs does when it
needs to guess the encoding of a file.
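
To make that rule concrete, here is a minimal sketch in Python.  It is
only a model of the selection order described above, not Emacs's
actual implementation: the names choose_charset and can_encode are
hypothetical, and Python codecs stand in for Emacs charsets.

```python
from typing import Optional, Sequence


def can_encode(ch: str, codec: str) -> bool:
    """Return True if the given Python codec can represent ch."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def choose_charset(ch: str,
                   declared: Optional[str],
                   priority: Sequence[str]) -> Optional[str]:
    """Pick a charset for ch (hypothetical model, not Emacs code).

    declared plays the role of the 'charset' text property;
    priority plays the role of the locale-derived priority list.
    """
    # The text announces its charset: use it if it covers the character.
    if declared is not None and can_encode(ch, declared):
        return declared
    # Otherwise fall back to the first charset in priority order
    # that supports the character.
    for codec in priority:
        if can_encode(ch, codec):
            return codec
    return None


# Assumed priority list for illustration only.
locale_priority = ["iso8859_15", "latin_1"]

# INVERTED EXCLAMATION MARK with and without a declared charset.
with_property = choose_charset("\u00a1", "latin_1", locale_priority)
without_property = choose_charset("\u00a1", None, locale_priority)
```

In this model both charsets "succeed" for the character; the declared
one wins only because the text asked for it, which is the situation
HELLO sets up on purpose.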

> In any case, it's ok with me to drop this now, since it's
> become clear to me that "preferred charset" is not a technical term but
> a term of convenience used only by describe-char, and it hasn't bothered
> anyone till now (and I hadn't thought about it till now either).  Thanks
> for the feedback.

Thanks for pointing out how this display might be confusing; I have
now removed the "preferred" part from the display, and added
descriptions of how each attribute of the character is obtained, so
that interested users can drill down.