Re: Character group folding in searches
From: Eli Zaretskii
Subject: Re: Character group folding in searches
Date: Mon, 09 Feb 2015 17:40:44 +0200
> From: Stefan Monnier <address@hidden>
> Cc: address@hidden, address@hidden
> Date: Sun, 08 Feb 2015 22:03:08 -0500
>
> > Char-tables are efficient, and at least for decomposition they seem to
> > be the perfect vehicle. DFAs that come out of arbitrary regexps,
> > OTOH, can sometimes be very inefficient. That's why I tend to think
> > about this in terms of char-tables.
>
> That's a false dichotomy.
Actually, it's not a dichotomy at all. I just explained why
char-tables seem to be a good basis on which to build this feature.
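To make the char-table idea concrete, here is a minimal sketch in Python rather than Emacs Lisp. A plain dict stands in for a char-table, and the names FOLD_TABLE, fold_char and fold_decomposition are hypothetical, not anything from Emacs. It assumes that "folding" means applying canonical decomposition and dropping the combining marks, which is only one possible reading of the feature under discussion.

```python
import unicodedata


def fold_char(ch: str) -> str:
    """Decompose ch canonically (NFD) and drop any combining marks."""
    decomposed = unicodedata.normalize("NFD", ch)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Keep the original character if stripping would leave nothing.
    return base or ch


# Hypothetical stand-in for a char-table: one entry per input character.
# Only characters that fold to something else are stored; every other
# character implicitly maps to itself.
FOLD_TABLE = {}
for codepoint in range(0x00C0, 0x0250):  # Latin-1 Supplement .. Latin Extended-B
    ch = chr(codepoint)
    folded = fold_char(ch)
    if folded != ch:
        FOLD_TABLE[ch] = folded


def fold_decomposition(text: str) -> str:
    """Fold text one character at a time using the table."""
    return "".join(FOLD_TABLE.get(c, c) for c in text)


print(fold_decomposition("café naïve"))
```

Each lookup is a single table access per input character, which is the property that makes a char-table attractive here.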
> DFA is about *recognizing* multi-char entities. If the input
> entities you care about are only single-char (as is the case for
> decomposition), then your DFA will degenerate to a single char-table
> (as is the case now).
I think we have a miscommunication here. I was talking about the
tables that are part of a DFA that drive its state machine. Those
tables might become large and sparse, certainly if the input symbol
can be any Unicode character, most of which only match themselves.

I guess I'm still struggling to understand your idea of using DFAs.
E.g., you talk about each node of a DFA being a char-table, but AFAIK
a DFA node is just a state of the automaton, so how can that be
expressed as a char-table? And above you are saying that a "DFA will
degenerate to a single char-table", which again is a stumbling block
for me, since a DFA is more than a table. What am I missing?
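One possible reading of "each node is a char-table" can be sketched as follows. This is an illustration under assumptions, not a description of any existing or proposed Emacs code: the state names, the TRANSITIONS layout and the function matches_arrow are all hypothetical, and Python dicts stand in for char-tables. Each state owns a sparse table from input character to next state, and a missing entry means "no transition".

```python
# Hypothetical three-state automaton that accepts either "=>" or "⇒".
START, SAW_EQUALS, ACCEPT = 0, 1, 2

# One sparse table per state: input character -> next state.
TRANSITIONS = {
    START: {"=": SAW_EQUALS, "⇒": ACCEPT},
    SAW_EQUALS: {">": ACCEPT},
    ACCEPT: {},
}


def matches_arrow(text: str) -> bool:
    """Return True if the whole of text is accepted by the automaton."""
    state = START
    for ch in text:
        state = TRANSITIONS[state].get(ch)
        if state is None:  # no entry in this state's table: reject
            return False
    return state == ACCEPT


print(matches_arrow("=>"), matches_arrow("⇒"), matches_arrow("="))
```

In this sketch, if every recognized entity were a single character, only the table for START would ever be consulted, which may be what "degenerate to a single char-table" refers to. The tables stay small here only because absent entries are implicit; a dense table per state over all of Unicode would be large and mostly empty.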
> But how do you use current char-tables to handle multi-char input
> entities (i.e. to recognize things like "=>")?
I don't understand the question, sorry. The simple answer is that a
char-table entry can be any Lisp object, including a string, but you
already know that.

If you mean how to compare "=>" with "⇒", then the latter will be
"folded" to the former using a char-table, and then the results will
be compared, either as strings or character by character. Is this
what you were asking?
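The fold-then-compare approach just described can be sketched like this, again in Python with a dict standing in for a char-table whose entries are strings. The table contents and the names ARROW_FOLD, fold_arrows and folded_equal are hypothetical; in particular, the mapping from "⇒" to "=>" is written by hand here and is not taken from the Unicode decomposition data.

```python
# Hypothetical table: a single character may fold to a multi-character string.
ARROW_FOLD = {
    "⇒": "=>",
    "→": "->",
}


def fold_arrows(text: str) -> str:
    """Replace each character by its table entry, or keep it unchanged."""
    return "".join(ARROW_FOLD.get(c, c) for c in text)


def folded_equal(a: str, b: str) -> bool:
    """Compare two strings after folding both of them."""
    return fold_arrows(a) == fold_arrows(b)


print(folded_equal("a ⇒ b", "a => b"))
```

Note that the table is only ever indexed by a single character, so the multi-character form "=>" never has to be recognized in the input; both sides are brought to the same folded form before the comparison.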
> > Who and how will create such a DFA?
>
> They'd be mechanically constructed (by hand-written code), for example
> driven by the existing Unicode tables.
What would be the input language for specifying such a DFA? I mean,
how would we specify which sequence of states are acceptable (yielding
a match for the search) and which aren't?