RE: char equivalence classes in search - why not symmetric?

From: Drew Adams
Subject: RE: char equivalence classes in search - why not symmetric?
Date: Thu, 10 Sep 2015 14:46:39 -0700 (PDT)

Yesterday I said:

 > 2. The code I have is not sufficient for everything.  You can
 > use it to see what the behavior is for single-char entries in the
 > char table, which includes accented chars (chars with diacritics).
 > But it does not also handle multiple-char entries in the table.
 > For instance, you can search for "é" and get char folding, but you
 > cannot search for "é" and get char folding.  The first of these is
 > just the char named LATIN SMALL LETTER E WITH ACUTE.  The second is
 > plain "e" composed with "́" (the char named COMBINING ACUTE ACCENT).
 > Some more work would be needed to make such combinations work too.
 > As I said, I'm no expert on char tables.  But the attached code
 > should give you a good idea of what is involved.

The attached version seems to take care of this, so you can search
with, say, the decomposition "é" and get the same effect as
searching for the fully composed char "é".

Again, just load the file, to try it out.  Remember that M-s '
toggles char folding.
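
As a rough illustration of the symmetry described here, the idea can be
sketched outside Emacs. This is not the attached Emacs Lisp code, which
works with a char table. It is a minimal Python sketch that assumes
folding means "decompose to NFD, then drop combining marks", applied to
both the search string and the text. The names `fold` and `folded_find`
are hypothetical.

```python
import unicodedata

def fold(s):
    """Fold a string: decompose it (NFD) and drop combining marks."""
    decomposed = unicodedata.normalize("NFD", s)
    # unicodedata.combining() is nonzero for combining chars such as U+0301.
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def folded_find(needle, haystack):
    """Report whether needle occurs in haystack after folding both sides."""
    return fold(needle) in fold(haystack)

composed = "\u00e9"     # LATIN SMALL LETTER E WITH ACUTE (one char)
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT (two chars)

# Both spellings fold to plain "e", so either one finds the other.
print(fold(composed) == fold(decomposed))
print(folded_find(decomposed, "caf" + composed))
print(folded_find(composed, "cafe"))
```

Because both sides are folded, it does not matter whether the search
string or the buffer text carries the accent, or in which form.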

At the end of the file there are a few strings you can use to test.
When you see two consecutive strings there that look the same, the
first is a decomposition, and the second is the same char fully
composed.

For example: "é" "é".  (The first string is two chars, U+0065 and
U+0301, however it might be displayed.)

`C-u C-x =' on the first char of the first string tells you:
LATIN SMALL LETTER E, decomposition: (101) ('e')
and on the second char it tells you:
COMBINING ACUTE ACCENT, decomposition: (769) ('́').

`C-u C-x =' on the single char of the second string tells you:
LATIN SMALL LETTER E WITH ACUTE, decomposition: (101 769) ('e' '́')
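
The same names and decomposition can be cross-checked against the
Unicode character database from outside Emacs. The following is a small
Python sketch, not part of the attached file; `describe` is a
hypothetical helper. One difference to keep in mind: for a char with no
decomposition, Emacs reports the char itself, as in (101) above, while
Python's `unicodedata.decomposition` returns an empty string.

```python
import unicodedata

def describe(ch):
    """Return (Unicode name, list of decomposition code points) for ch."""
    decomp = unicodedata.decomposition(ch)  # e.g. "0065 0301", or "" if none
    # Skip compatibility tags such as "<compat>"; keep only hex code points.
    points = [int(f, 16) for f in decomp.split() if not f.startswith("<")]
    return unicodedata.name(ch), points

for ch in ("e", "\u0301", "\u00e9"):
    name, points = describe(ch)
    print(name, points)
```

For U+00E9 the decomposition is the pair of code points 101 and 769,
which matches what `C-u C-x =' shows for the composed char.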

Attachment: symmetric-char-fold.el
