bug-gnu-emacs

bug#22038: 25.1.50; Character folding issues with isearch


From: Stephen Berman
Subject: bug#22038: 25.1.50; Character folding issues with isearch
Date: Sat, 28 Nov 2015 19:26:20 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1.50 (gnu/linux)

On Sat, 28 Nov 2015 19:40:26 +0200 Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Stephen Berman <stephen.berman@gmx.net>
>> Cc: 22038@debbugs.gnu.org
>> Date: Sat, 28 Nov 2015 18:10:53 +0100
>> 
>> > (That's the only way I could parse "multiple characters matching a
>> > single string".)  We will have that, but it won't allow "ss" to match
>> > "ß", unless you customize character-fold-table to include that.  The
>> > reason is that "ß" doesn't have any decompositions in the Unicode
>> > database, so the default character-fold-table doesn't include any
>> > expansions for it.
>> 
>> This suggests to me that basing character folding solely on character
>> decomposition is insufficient.  From a user's point of view I see no
>> reason why the search string "a" under character-folding matches "ä" but
>> not e.g. "æ".  Requiring a customization to get the latter strikes me as
>> a user-unfriendly crutch to work around a deficient implementation.  (I
>> don't know if it's easy to improve, I'm just giving my impression as a
>> user.)
>
> Easiness is not the most important issue here: there's a more basic
> problem involved.  Both "ß" vs "ss" and "æ" vs "a" (or "ae") are
> language-specific: they are only valid matches in the context of
> specific languages.  AFAIU, that is why they are not in the Unicode
> database.  And we don't yet have language-specific text processing
> capabilities and infrastructure (well, string-collate-lessp and
> string-collate-equalp are a beginning, but only that).  So allowing
> those by default risks running afoul of what users want.

I'm not sure what you mean by "only valid matches in the context of
specific languages", but it sounds like what Per Starbäck said about "ä"
being considered a completely separate character from "a" in Swedish,
unlike in German.  Yet if this is a language-specific difference, Emacs
doesn't respect it by default, since "a" does match "ä" under
character-folding.  (Or does it stop matching when
current-language-environment is Swedish?  I suspect not.)

But I know nothing about the Unicode specifications; maybe you are
referring to a more subtle issue, which may be unrelated to my point,
which is simply that I think it should be just as convenient for a user
whose keyboard may lack "ß" or "æ" to match these characters by
searching with "s" or "a" (or "e" or "ae") as it is to match "ff" by
searching with "f".  This is not a language-specific issue AFAICS.
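
To make the decomposition point concrete, here is a minimal sketch in
Python rather than Emacs Lisp, using the standard unicodedata module.
It is only an approximation of decomposition-based folding and not the
code Emacs uses to build character-fold-table; the helper names
decomposition and fold are made up for this illustration.  It shows
which of the characters discussed above carry a decomposition in the
Unicode database and which do not.

```python
import unicodedata


def decomposition(ch):
    """Return the raw decomposition field from the Unicode database.

    The result is an empty string when the character has no decomposition.
    """
    return unicodedata.decomposition(ch)


def fold(text):
    """Approximate folding: compatibility-decompose, then drop combining marks.

    This is an illustrative stand-in, not Emacs's character-fold-table.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


# "\ufb00" is the single-character ligature LATIN SMALL LIGATURE FF.
for ch in ("ä", "ß", "æ", "\ufb00"):
    print(f"U+{ord(ch):04X} {ch!r}: "
          f"decomposition={decomposition(ch)!r}, folded={fold(ch)!r}")
```

In this sketch "ä" folds to "a" and the ligature folds to "ff", while
"ß" and "æ" come back unchanged because the database lists no
decomposition for them, so matching them with "ss" or "ae" would need
extra mappings beyond the decomposition data.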

Steve Berman




