[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: msgmerge speedup: fstrcmp and diffseq improvements

From: Bruno Haible
Subject: Re: msgmerge speedup: fstrcmp and diffseq improvements
Date: Mon, 22 Sep 2008 12:27:57 +0200
User-agent: KMail/1.5.4

Hello Ralf,

Ralf Wildenhues wrote:
> BTW, I now found a newer paper that has some more interesting stuff, if
> a bit overkill for msgmerge: <http://fastss.csg.uzh.ch/ifi-2007.02.pdf>.
> Also, there is this good overview of algorithms:
> <ftp://ftp.dcc.uchile.cl/pub/users/gnavarro/survasm.ps.gz>.

Thanks for these pointers. The first one explains how Google's "did you mean"
feature works, but is hardly applicable here, because for a string of length
100, msgmerge is looking for similar strings with possible 30 differences
(not just 2 or 3 differences). The second one is very interesting, especially
the filtering algorithms look fancy - but also hard to put in place here
because they are good for ɑ < 0.1, whereas here ɑ can be 0.3. The algorithm
that appears to be most useful for our range of parameters is BPM (bit-
parallelized Myers). But I wonder whether it can combine with the dynamic
optimizations that we have already put in place - namely, that the lower_bound
is much higher for the later messages being compared than for the first ones
in the list.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]