Re: On language-dependent defaults for character-folding

emacs-devel

From:	John Wiegley
Subject:	Re: On language-dependent defaults for character-folding
Date:	Sat, 27 Feb 2016 00:58:02 -0800
User-agent:	Gnus/5.130014 (Ma Gnus v0.14) Emacs/24.5 (darwin)

>>>>> Eli Zaretskii <address@hidden> writes:

> The simplest change would be to have character-folding disabled by default
> in some European locales whose users expressed objections to having it on by
> default, due to folding of some characters that shouldn't be folded in the
> languages of those locales.

> Another, more complex, but still simple enough, possibility would be to have
> character-folding on by default, but have the problematic foldings filtered
> out from the regexp used by it. We could either always filter out all of
> them, or filter out only some of them, as determined by the user locale. For
> example, in the Spanish locales, ñ will not be folded.

> The next alternative is to come up with a fine-grained classification of
> character-folding, and provide user options to control each one of them
> independently, with the defaults determined by the user locale. For example,
> one class of folding is the one required for matching pre-composed
> characters such as á with its decomposed variant á; another class is for
> finding "similar" characters, such as finding ⒜ when looking for a. There
> should probably be classes that are disliked by users of certain languages,
> such as ñ for Spanish. Etc. etc. (I think this alternative needs more
> research and user feedback, and so is probably not for the release branch.)

> Maybe there are more alternatives, I don't know. It's not like they were
> explicitly proposed by someone; the above are just my personal conclusions
> from reading the discussion.
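
For illustration only, the second alternative quoted above (keep folding on,
but exempt certain characters depending on the user's language) might look
roughly like the following Python sketch. It is not how Emacs implements
character folding: the NO_FOLD table, its contents, and the fold and matches
helpers are hypothetical names made up for this example, and folding both the
search string and the searched text is a simplification.

```python
import unicodedata

# Hypothetical per-language exemptions: characters that should NOT be
# folded to their base letter. Illustrative only, not an Emacs table.
NO_FOLD = {
    "es": {"ñ", "Ñ"},
    "sv": {"å", "ä", "ö", "Å", "Ä", "Ö"},
}


def fold(text: str, lang: str) -> str:
    """Strip combining marks from text, except for characters that the
    (assumed) language table marks as distinct letters."""
    protected = NO_FOLD.get(lang, set())
    out = []
    # NFC first, so precomposed and decomposed input are treated alike.
    for ch in unicodedata.normalize("NFC", text):
        if ch in protected:
            out.append(ch)
            continue
        # Decompose, then drop the combining marks (accents, diaereses...).
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed
                           if not unicodedata.combining(c)))
    return "".join(out)


def matches(needle: str, haystack: str, lang: str) -> bool:
    """Substring search after folding both sides (a simplification)."""
    return fold(needle, lang) in fold(haystack, lang)
```

Under these assumptions, matches("ano", "el año", "en") is true while
matches("ano", "el año", "es") is false. The sketch takes the language as an
explicit argument and says nothing about where that value should come from.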

Thank you for that summary. From that reading, it sounds like this will
require a fairly complex decision tree to determine what should be folded,
and when, based on the details of each particular country/language? That is,
we can't expect to make a single decision up front, but will need feedback
from users in every country that uses Emacs in order to determine what the
correct settings are for each language?

And what about a Swedish speaker living in America who uses en_US because
that's what 90% of his text is in, who then wants to search some Swedish text?
Is it the locale that determines it, or something specific to the nature of
the text in each buffer? And how would Emacs know?

Unless I'm missing the light at the end of this tunnel, this feature is
just not ready for prime time as a default. There are too many unanswered
questions, and it sounds like none of them can be answered in the abstract for
every case. I have a feeling we'd be getting bug reports constantly from users
whose language contains details we never anticipated.

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2
