[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

char equivalence classes in search - why not symmetric?

From: Drew Adams
Subject: char equivalence classes in search - why not symmetric?
Date: Tue, 1 Sep 2015 08:46:26 -0700 (PDT)

When character folding is turned on, shouldn't you be able to
search for á and find (match) a, à, ã, ª, â, å, and ä?

I think so.  Currently you cannot - you can only do the reverse:
search for a and find any of the above.  a is treated specially.

I suppose that the logic behind the current implementation is
to mirror what we do with case-fold searching.  But is that the
right thing in this case?

For case-fold searching, it was thought that if you bother to
hold the Shift key and thus use an uppercase letter then you
want to match case, and otherwise you do not (case-insensitive).

This was essentially, I think, a shortcut for programmers, and
it was introduced at a time when much of the code being searched
was case-ambivalent.  (UNIX was still pretty much an exception
at that point, in distinguishing lowercase letters.)

Whether or not this behavior for case-fold is still a good thing
is questionable now, I think.  I don't think it is necessary now
or particularly useful.  And I think it can be confusing to
newbies.  Why should searching for A be different from searching
for a, wrt case matching?

But I'm not really questioning the behavior of case-fold
searching now.  I am questioning applying this same behavior
to char folding.

To me, folding a group of chars together for search purposes
should be symmetric - go both ways.  It should, in effect,
treat the given group of chars as equivalent - as an
equivalence class wrt searching.

Why not?  Why, when char folding, treat plain a specially for
searching?  Why not treat á, a, à, ã, ª, â, å, and ä the same?
Isn't that the point here?  We are telling Isearch that they
are equivalent.  Why pick one of them as the canonical
search-pattern to use for finding any of them?  Why privilege
a over á, a, à, ã, ª, â, å, and ä?

Now most of the time I, like most people, will by typing a
instead of á into a search string.  But that's not really the
point.  I think users should be able to use any members of an
equivalence class of chars indifferently.

And when it comes to chars other than letters, it might well
be that some users, with some keyboards, will find some chars
in an equivalence class easier to type than others.  Let them
use/type whichever they like, no?

This feature, welcome as it is, seems only half-baked, so far.
How about equality for char-folding equivalence?

reply via email to

[Prev in Thread] Current Thread [Next in Thread]