[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: new-flex-completion-style

From: João Távora
Subject: Re: new-flex-completion-style
Date: Thu, 14 Feb 2019 16:12:25 +0000
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (windows-nt)

Daniel Pittman <address@hidden> writes:

>  > Feel free to add some kind of normalization to the function
>  > if you can come up with it.  Or come up with another function
>  > entirely.  As I said, a function that measures the distance
>  > in "boring editing operations" between the pattern and the target
>  > would be a nicer measure.
> The Levenshtein distance and close variants generally do a reasonable
> job of approximating human expectations, yes.  Close variants (eg:
> only delete/insert, no mutate) also work well.

`string-distance' in Emacs is the Levenshtein distance and it does a
terrible job.  it says "foo" is as close to "foobar" as it is to
"fboror", which is not acceptable.  And neither is a variant like
Damerau–Levenshtein.  I could possibly use a variant that computes
things in "elementary boring editing operations", which includes moves.

There is a fundamental misunderstanding on your side I think.  The
distance formulae exist to solve another class of problems, which
include comparing strings of characters where one is _not_ a subsequence
of the other (a subsequence being a not-necessarily-contiguous in-order
subset of elements).  Often these problems want to compute not only the
distance _and_ but also the set of insertion, deletions, transposition
that bring a string to another.  This is why they are slow.  But they do
enable fancy things like auto-correct.

Those occur in the super well-studied high-tech fields that you suggest
I'm ignorant of.  And while I'm certainly no expert on the matter, I do
think none of that matters here.  Emacs uses things like "string-match"
to tell if a string matches another.  If it doesn't it doesn't, and
that's OK for "flex".

If, for some reason, the regexp starts being slow for flex-matching,
since it writes a regexp like ".*f.*o.*o.*", then it's very easy to
optimize these flex patterns in C directly.  A greedy implementation
(which is fine) will be just O(n).

So I don't know what speed problems you are referring to.  Neither is it
clear to me what fields your "fzf" suggestion would be "blazingly fast"
(as it self-describes itself) and/or how plugging it into Emacs would
solve anything.

In short for "flex", at least for the "flex" I'm interested in, Emacs
can separate the matching and scoring problem, so your worries do not
make sense.  This "flex" style even already exists in Emacs, but only in
ido.el.  That's what I'm trying to being to other areas of Emacs.

On the other hand, if you're writing an "autocorrection" completion
style then yes, these things could start making sense.  But I'm not
writing that.

> PPS: the best modern tools seem to be derived from the problem of
> genome sequence matching, but heuristics are definitely helpful too.
> Again, fzf is the best version of that I have found.

Does it autocorrect?  If yes, it might be a good choice for that other
completion style you may be describing.  It's hardly a good choice for


reply via email to

[Prev in Thread] Current Thread [Next in Thread]