bug-gnu-emacs
bug#13041: 24.2; diacritic-fold-search


From: Juri Linkov
Subject: bug#13041: 24.2; diacritic-fold-search
Date: Sat, 08 Dec 2012 01:55:22 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (x86_64-pc-linux-gnu)

> - leave the text alone but give each string that should be handled
>   specially a text property with the normalized form.  In this case
>   searching has to pay attention to these properties, if present.
>
> - normalize the text and give each normalized string a text property
>   with the original text.  In this case searching will proceed as usual
>   but you have to restore the original text when done.

This reminds me of the idea that searching should take into account the
text displayed with the `display' property and other display-related
properties.  That seems more difficult to implement, though.

> Also I don't know how to handle the return value and/or highlighting
> when, for example, finding a match for "suf" within "suffer".  For
> example, replacing each occurrence of "suf" with the empty string should
> leave us with "fer" here.

I believe such ligature characters should be handled as a whole,
i.e. "suf" shouldn't match "suffer"; only "suff" should match it.
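
To make the whole-character rule concrete, here is a rough sketch in
Python (not Emacs Lisp, and not a proposed implementation).  It assumes
"suffer" is written with the single ligature character U+FB00, uses NFKD
as a stand-in for whatever folding table we end up with, and the names
`fold_with_map' and `fold_search' are made up for illustration.  A match
is accepted only if it starts and ends on a boundary between original
characters:

```python
import unicodedata


def fold_with_map(text):
    """Return (folded, owners); owners[i] is the index in `text` of the
    original character that produced folded[i]."""
    folded, owners = [], []
    for i, ch in enumerate(text):
        # NFKD is only a stand-in for a real folding table (assumption).
        for c in unicodedata.normalize("NFKD", ch):
            if unicodedata.combining(c):
                continue  # drop combining marks, i.e. diacritics
            folded.append(c)
            owners.append(i)
    return "".join(folded), owners


def fold_search(needle, text):
    """Yield (start, end) spans in the ORIGINAL text whose folded form
    equals the folded needle, never splitting an original character."""
    folded, owners = fold_with_map(text)
    key, _ = fold_with_map(needle)
    if not key:
        return
    pos = folded.find(key)
    while pos != -1:
        end = pos + len(key)
        # A boundary is clean when the neighbouring folded characters
        # come from different original characters.
        starts_clean = pos == 0 or owners[pos - 1] != owners[pos]
        ends_clean = end == len(folded) or owners[end - 1] != owners[end]
        if starts_clean and ends_clean:
            stop = owners[end - 1] + 1
            # Keep trailing combining marks with the character they modify.
            while stop < len(text) and unicodedata.combining(text[stop]):
                stop += 1
            yield owners[pos], stop
        pos = folded.find(key, pos + 1)


if __name__ == "__main__":
    word = "su\ufb00er"  # s, u, LATIN SMALL LIGATURE FF, e, r
    print(list(fold_search("suf", word)))   # no match: would split the ligature
    print(list(fold_search("suff", word)))  # matches original span (0, 3)
```

With this rule, replacing a match always replaces whole original
characters, so the question of what remains of a half-matched ligature
does not arise.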

> I have no idea how many mappings like "ß" -> "ss" exist.  The problem is
> that we don't get them from UnicodeData.txt IIUC.

I can't find them in UnicodeData.txt either.  Looking at the files in
http://www.unicode.org/Public/UNIDATA/ I can find them in the file

http://www.unicode.org/Public/UNIDATA/DerivedNormalizationProps.txt

that is derived from

http://www.unicode.org/Public/UNIDATA/CaseFolding.txt
http://www.unicode.org/Public/UNIDATA/SpecialCasing.txt
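
As a rough illustration of how such mappings could be extracted, here is
a small Python sketch (again only a sketch, not Emacs code).  It assumes
the CaseFolding.txt line format `<code>; <status>; <mapping>; # <name>',
where status F marks a full folding that expands to several characters.
The three sample lines are embedded so the example runs without
downloading anything, and `full_foldings' is a made-up name:

```python
SAMPLE = """\
# Excerpt in the CaseFolding.txt line format
00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S
FB00; F; 0066 0066; # LATIN SMALL LIGATURE FF
0041; C; 0061; # LATIN CAPITAL LETTER A
"""


def full_foldings(lines):
    """Collect only the status-F (full, multi-character) foldings."""
    table = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # strip comments
        if not line:
            continue
        fields = [f.strip() for f in line.split(";")]
        code, status, mapping = fields[0], fields[1], fields[2]
        if status != "F":
            continue  # skip common (C), simple (S) and Turkic (T) entries
        source = chr(int(code, 16))
        target = "".join(chr(int(cp, 16)) for cp in mapping.split())
        table[source] = target
    return table


TABLE = full_foldings(SAMPLE.splitlines())

if __name__ == "__main__":
    for source, target in TABLE.items():
        print(f"U+{ord(source):04X} -> {target!r}")
```

Run over the full file instead of the excerpt, the same loop should give
a count of how many multi-character mappings like "ß" -> "ss" exist.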
