emacs-devel
Re: Character group folding in searches


From: Eli Zaretskii
Subject: Re: Character group folding in searches
Date: Sat, 07 Feb 2015 10:38:04 +0200

> Date: Fri, 6 Feb 2015 22:08:19 +0000
> From: Artur Malabarba <address@hidden>
> Cc: Stefan Monnier <address@hidden>, emacs-devel <address@hidden>
> 
> >> > Because the other way you cannot use char-tables.  And because
> >> > matching "a" and "á" will be hard the other way.
> >>
> >> Maybe I'm missing something, but if you have "á" expand to "a´", it
> >> won't match "a", will it?
> >
> > It will, if you only pay attention to the base character.
> 
> If you have the possibility of only paying attention to the base
> character (if the machinery is in place) then there's no reason to
> fold "á" into "a´" (folding 1 char into many).
> 
> Just fold everything into "a". Then (by only paying attention to the
> base character) "á" and "á" will match, because "á" folds into "a"
> which is the base character of "á".

But we need both capabilities, since whether or not a match of the
base character is enough depends on what the caller/user wants.
Folding everything into the base character supports only part of those
features.

As the simplest example, how can you have "á" and "a´" match, but "á"
and "a¨" fail to match, if you _only_ look at the base character?
(Btw, using ´ and ¨ here is incorrect; the correct characters are
their combining variants, U+0301 and U+0308.  I left the ones you used
just for clarity, to prevent Emacs from composing a and the following
combining character.)
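
To make the problem concrete, here is a minimal sketch in Python (not
Emacs code).  It assumes that "looking only at the base character"
means decomposing with NFD and dropping every combining mark; the
helper name base_only is made up for illustration.

```python
import unicodedata

def base_only(s):
    """Fold to base characters: decompose (NFD), then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if unicodedata.combining(c) == 0)

a_acute_precomposed = "\u00e1"      # á as a single code point
a_acute_decomposed = "a\u0301"      # a + COMBINING ACUTE ACCENT
a_diaeresis_decomposed = "a\u0308"  # a + COMBINING DIAERESIS

# The match we want: both strings spell "a with acute".
wanted_match = base_only(a_acute_precomposed) == base_only(a_acute_decomposed)

# The match we do not want: acute versus diaeresis.
unwanted_match = base_only(a_acute_precomposed) == base_only(a_diaeresis_decomposed)
```

Both comparisons come out equal, because all three strings fold to a
plain "a".  Base-only folding cannot tell the two cases apart.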

And then there are more complex examples, like "q̣̇" that should match
"q̣̇" (because the ordering of combining marks doesn't matter), etc.

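As a rough illustration of the ordering point, again in Python rather
than Emacs Lisp: Unicode canonical ordering, which NFD applies, sorts
adjacent combining marks by their combining class.  The specific marks
below (U+0323 DOT BELOW and U+0307 DOT ABOVE) are picked for the
example.

```python
import unicodedata

# q followed by the same two marks in opposite orders.
below_then_above = "q\u0323\u0307"
above_then_below = "q\u0307\u0323"

# The raw code point sequences differ ...
raw_equal = below_then_above == above_then_below

# ... but NFD puts marks of different combining classes (220 and 230
# here) into one canonical order, so the normalized forms agree.
nfd_equal = (unicodedata.normalize("NFD", below_then_above)
             == unicodedata.normalize("NFD", above_then_below))
```

Note that canonical ordering only reorders marks whose combining
classes differ; marks of the same class keep their relative order.
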
What this tells me is that we do need to fold "á" into "a´", and
then use a comparison function that pays attention to the "folding
options" specified by the caller, to decide which parts of the folded
sequence to ignore, and also how to compare the non-ignored parts
(e.g., with some options the order of non-base characters should not
matter).
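
One possible shape for such a comparison function, sketched in Python
under stated assumptions: folding is NFD decomposition, and the
"folding options" are two hypothetical flags, ignore_marks and
ignore_mark_order.  None of these names come from Emacs; this only
shows the idea of decomposing first and deciding afterwards what to
ignore.

```python
import unicodedata

def split_clusters(s):
    """Decompose s (NFD) and group it into (base, [marks]) pairs."""
    clusters = []
    for ch in unicodedata.normalize("NFD", s):
        if unicodedata.combining(ch) and clusters:
            clusters[-1][1].append(ch)   # attach mark to the preceding base
        else:
            clusters.append((ch, []))    # start a new cluster
    return clusters

def folded_equal(a, b, ignore_marks=False, ignore_mark_order=False):
    """Compare two strings after folding, honoring the caller's options."""
    ca, cb = split_clusters(a), split_clusters(b)
    if len(ca) != len(cb):
        return False
    for (base_a, marks_a), (base_b, marks_b) in zip(ca, cb):
        if base_a != base_b:
            return False
        if ignore_marks:
            continue                     # base characters alone decide
        if ignore_mark_order:
            marks_a, marks_b = sorted(marks_a), sorted(marks_b)
        if marks_a != marks_b:
            return False
    return True
```

With the defaults, folded_equal("\u00e1", "a\u0301") is true and
folded_equal("\u00e1", "a\u0308") is false; passing ignore_marks=True
makes both true, which is the base-character-only behavior.  Because
the marks are kept after folding, the caller chooses which behavior
applies.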



