RE: char equivalence classes in search - why not symmetric?

From: Drew Adams
Subject: RE: char equivalence classes in search - why not symmetric?
Date: Tue, 8 Sep 2015 07:24:09 -0700 (PDT)

> The discussion here is entirely about the DWIM
> UI of isearch that allows requesting strict matching by having at
> least one uppercase or accented character, even though lax mode is
> enabled.

The proposal is explicitly *not* for the former, now.  The weird
exception of an uppercase letter making the current search be
case-sensitive, even though you have toggled case sensitivity
OFF, is not under attack now.

Personally, yes, I would get rid of that anomaly too at some
point, but I'm not proposing that now.  Likewise, for the
anomaly that whitespace folding is switched off by SPC SPC.
That too, I would like to see removed eventually, but I'm not
proposing that now either.

The point now is to DTRT wrt char folding - the new feature.

> Drew prefers a UI that enables/disables strict mode using a
> special isearch command bound to a key.

We already have that.  What I'm proposing in this thread is
that when char folding is on, it work symmetrically: Folding
should let you use `é' in the search string to match any of
the accented or unaccented variants, just as it does for `e'
in the search string.

Nothing more.  What's good for `e' should be good for `é' and
all the rest.  It's about equivalence classes.  There is no
reason to limit search strings to one privileged member of
an equivalence class when trying to match any members of the
class.  That's all.
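
To make the equivalence-class point concrete, here is a minimal
sketch in Python.  It is not Emacs code and not how Isearch is
implemented; the names `fold' and `symmetric_match' are hypothetical.
It assumes a class can be approximated by Unicode decomposition with
the combining marks dropped, and it folds *both* the search string
and the text, so no member of a class is privileged.

```python
import unicodedata

def fold(s: str) -> str:
    """Map each char to a class representative (illustrative assumption):
    decompose with NFD, then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def symmetric_match(needle: str, haystack: str) -> bool:
    """Fold both sides, so any member of a class matches any other."""
    return fold(needle) in fold(haystack)

# `e' and `é' are interchangeable in the search string.
print(symmetric_match("resume", "my résumé"))
print(symmetric_match("résumé", "my resume"))
```

Both calls report a match, which is all that "symmetric" means here.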

> That would be plausible, if the DWIM
> UI for case fold search in isearch weren't 3 decades old.

See above.  I am *not* now proposing a change to case-fold
behavior.  I've made that clear from the beginning, and
repeated it several times now.

But it seems that it is easier, for those not favorable to
what I (and Juri, apparently) propose, to harp on the age-old
anomaly of uppercase case-fold annulment as, somehow (?), an
argument against clean, symmetric char folding.

Please argue about the topic at hand (see Subject line),
not whether the 1980s decision to make an exception for
an uppercase letter in the search string was or is a
good idea.

> But the DWIM UI *is* 3 decades old, and successful.  Drew
> disputes that, 

No, Drew does not.  You cannot show one place where anything
Drew has written suggests that he disputes that.

> but in the 25 years I've followed Emacs development this is
> the first time I've seen anybody complain about the DWIM-ish
> case folding feature.

Live and learn. ;-)  That is not the topic of this thread,
in any case.

> Note that incremental case-folded search (usually with no escape for
> strict matching!) has been widely adopted in web and file browsers.

Uh, no.  Case folding, yes.  But not case folding that
switches off (becoming case-sensitive) just because you
include an uppercase letter in the search string.  Not in any
browser I have, at least.  Nor in Notepad or TextPad or other
simple editors that newbies or non-programmers might be used to.

But again, *not* the subject of this topic.

> I'm +1 on generalizing this UI to "diacritic folding" in isearch.

By "this UI", I guess you mean that if there is a char with
a diacritic in the search string then that should turn off
char folding, preventing you from matching text ignoring
diacritics.

That would be unfortunate - a strict loss (inability to
match `é' against `e'; only ability to match `e' against `é'),
and with no gain.
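
A minimal sketch of that one-way behavior, in Python (not Emacs
code).  The rule modeled here, "a diacritic in the search string
turns folding off", is my reading of the suggestion, and the names
`fold' and `one_way_match' are hypothetical.

```python
import unicodedata

def fold(s: str) -> str:
    """Drop combining marks after NFD decomposition (illustrative assumption)."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def one_way_match(needle: str, haystack: str) -> bool:
    """Assumed DWIM rule: a diacritic in the needle switches folding off."""
    if fold(needle) != needle:
        # The needle contains a diacritic, so match strictly.
        return needle in haystack
    # Plain needle: fold only the text being searched.
    return needle in fold(haystack)

print(one_way_match("e", "é"))   # plain `e' still finds `é'
print(one_way_match("é", "e"))   # but `é' no longer finds `e'
```

The second call is the case that stops working under such a rule.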

> The other question is that of Ulrich Müller, who points out that it's
> natural for him to type his name correctly, but he'd like to laxly
> match Mueller and Muller, too.[1]

Same as my resumé example, yes.

And the use case includes various quotation marks (e.g. curly)
in the search string and wanting to match various others in
the text.  E.g., you copy some text from a web page, which
includes some curly quote marks, and you want to match text
in your buffer but ignoring the difference in quote-mark type.
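
The quote-mark case can be sketched the same way, in Python (not
Emacs code).  The class table below is an illustrative assumption,
not the set of chars Emacs actually folds, and `QUOTE_CLASSES',
`fold_quotes' and `quote_insensitive_match' are hypothetical names.

```python
# Illustrative classes only: representative -> members of its class.
QUOTE_CLASSES = {
    '"': '"\u201c\u201d\u201e',   # straight, left/right curly, low double
    "'": "'\u2018\u2019\u201a",   # straight, left/right curly, low single
}

# Translation table sending every member to its class representative.
TABLE = str.maketrans(
    {member: rep for rep, members in QUOTE_CLASSES.items() for member in members}
)

def fold_quotes(s: str) -> str:
    """Replace each quote mark by the representative of its class."""
    return s.translate(TABLE)

def quote_insensitive_match(needle: str, haystack: str) -> bool:
    """Fold both sides, so pasted curly quotes match straight ones and vice versa."""
    return fold_quotes(needle) in fold_quotes(haystack)

# Text pasted from a web page, with curly quotes, against a plain buffer.
print(quote_insensitive_match("\u201chello\u201d", 'say "hello" now'))
print(quote_insensitive_match('"hello"', "say \u201chello\u201d now"))
```

Either kind of quote mark in the search string finds the other kind
in the text.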

Likewise, for any of the other equivalence classes.  No reason
to privilege any particular member of a class, making it so
that only that member can be used in a search string to match
the other members.  We've seen no argument supporting such
a restriction.

(I can imagine an argument in terms of implementation, but
we have not heard that yet.  And *no* argument has been
given in user terms - UI.  Why should users be limited wrt
which class member they can use to match a class?)

> It's a valid use case, obviously,
> but based on an analogy to experience with DWIMish case-folding in
> Emacs, I believe most users will quickly adjust to typing "muller"
> when they want a poor man's version of full "orthographic
> equivalence".  Individuals may not, but I believe the great majority
> will, since I'm sure it's anatomically easier to type "muller" than
> "Müller", even on a German keyboard.

It's not only about typing.  That seems to be the main point
that those who repeat this mantra forget.  Text can be pasted
into an Isearch string, including text copied from outside
Emacs.  Text using any Unicode chars, from any languages.

> Footnotes:
> [1]  Drew also argues this point, but from an abstract insistence on
> "symmetry", which doesn't really exist here for representational,
> anatomical, psychological reasons, and let's not forget personal
> historical reasons like "Müller is my name".

Nonsense.  I gave concrete examples.  It's not an academic
argument.  It's about really having character folding, not
just a one-way character folding that requires you to type
(or edit a pasted string) _only_ the "canonical" chars that
are folded.  It's a practical argument, not an abstract
insistence on symmetry.

Being _able_ to fold `é' to `e' or `è', and to fold one kind
of quote mark to others, is, yes, a normal use case.  Nothing
odd, abstract, or academic about it.  Herr Müller confirms
this with his own example.  This should be a no-brainer, IMO.
