emacs-devel

Re: search-default-mode char-fold-to-regexp and Greek Extended block characters


From: Robert Pluim
Subject: Re: search-default-mode char-fold-to-regexp and Greek Extended block characters
Date: Thu, 25 Jul 2019 22:44:29 +0200

>>>>> On Thu, 25 Jul 2019 21:40:12 +0300, Juri Linkov <address@hidden> said:

    >> Can you please explain why iota with dialytika and tonos needs to be
    >> special-cased in these places?

    Juri> Here is the test case that demonstrates the need to add it
    Juri> to char-fold-include:

    Juri> 0. emacs -Q
    Juri> 1. Paste this text to *scratch*: "ΐΐ"
    Juri> 2. Search for two IOTAs with char-fold, e.g.: C-s M-s ' ιι

    Juri> The char-fold search doesn't match the characters with combining
    Juri> accents with their base char GREEK SMALL LETTER IOTA.

    Juri> However, after adding (?ι "ΐ") to char-fold-include it can match the
    Juri> base character IOTA.

Yes, I see the problem now. Maybe this can be solved by adding that
mapping when building char-fold-table. Or 'those mappings' I should
say, since there are going to be many cases like this.

How about the following? It passes your tests with the FIXMEs
uncommented (and isearch for multiple iotas matches multiple iotas +
combining diacriticals).

I deliberately restricted it to lower case characters, since the
roundtripping fails for İ and a large number of titlecase characters.

diff --git i/lisp/char-fold.el w/lisp/char-fold.el
index f379229e6c..91fd7ddc28 100644
--- i/lisp/char-fold.el
+++ w/lisp/char-fold.el
@@ -108,6 +108,17 @@
                                     (car next-decomp)))
                            (funcall make-decomp-match-char (list (car next-decomp)) char)))
                      (setq dec next-decomp)))
+               ;; If there is no precomposed uppercase version of a
+               ;; character with diacriticals, we also add a mapping
+               ;; from the base character to the base character with
+               ;; combining diacriticals
+               (when (eq (get-char-code-property char 'general-category) 'Ll)
+                 (let* ((str (char-to-string char))
+                        (upper (upcase str))
+                        (roundtrip (downcase upper)))
+                   (when (> (length roundtrip) 1)
+                     (aset equiv (aref roundtrip 0)
+                           (cons roundtrip (aref equiv (aref roundtrip 0)))))))
                ;; Do it again, without the non-spacing characters.
                ;; This allows 'a' to match 'ä'.
                (let ((simpler-decomp nil)
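
To make the round-trip check in the patch above easier to follow, here is
a minimal sketch in Python rather than Emacs Lisp. It assumes that Python's
str.upper and str.lower apply the same full Unicode case mappings that
upcase and downcase apply to strings in Emacs. The function name
roundtrip_mappings and the dict standing in for the equiv char-table are
hypothetical and not part of char-fold.el.

```python
import unicodedata


def roundtrip_mappings(chars):
    """Collect base-char -> decomposed-string mappings for lowercase letters.

    Hypothetical stand-in for the patch's loop body: a plain dict plays the
    role of the `equiv` char-table.
    """
    equiv = {}
    for ch in chars:
        # Mirror the patch: only consider general category Ll.
        if unicodedata.category(ch) != "Ll":
            continue
        # Upcase, then downcase. When no precomposed uppercase form exists,
        # the result is the base letter followed by combining marks.
        roundtrip = ch.upper().lower()
        if len(roundtrip) > 1:
            # Prepend, like (cons roundtrip (aref equiv base)) in the patch.
            equiv.setdefault(roundtrip[0], []).insert(0, roundtrip)
    return equiv


# U+0390 has no precomposed uppercase form, so it round-trips to
# iota + combining dialytika + combining tonos.
table = roundtrip_mappings("\u0390a\u00e4\u0130")
for base, targets in table.items():
    print(base, [[hex(ord(c)) for c in t] for t in targets])
```

With this input only the iota gets an entry: "a" and "ä" round-trip to a
single character, and İ (U+0130) is skipped because its general category is
Lu. Without the Ll restriction, lowercasing İ would yield "i" followed by
U+0307, which is the kind of unwanted mapping the restriction avoids.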


