[bug-diffutils] bug#32993: Pathologically slow operation


From: Stefan Monnier
Subject: [bug-diffutils] bug#32993: Pathologically slow operation
Date: Mon, 08 Oct 2018 17:34:18 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)

I recently bumped into a `diff` operation that I killed after several
minutes while diffing two files (on a 3.7GHz Core i3, which is the fastest
machine I have).

These files were generated as part of Emacs's "refine-hunk" processing,
which tries to do word-level diffs by basically turning every word
into N copies of this word, each one on its own line (where N is the
number of chars in the word, used to indicate to `diff` that long words
are "more costly" than short ones).
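
To make the shape of these inputs concrete, here is a minimal Python sketch
of the expansion described above.  It is not the actual Emacs code: the
function name `expand_words` is hypothetical, and splitting on whitespace is
an assumption about what counts as a "word".

```python
def expand_words(text):
    """Expand text so that each word of length N becomes N identical lines.

    Hypothetical illustration only; splitting on whitespace is an
    assumption, not necessarily how Emacs tokenizes words.
    """
    lines = []
    for word in text.split():
        # Repeat the word once per character, so that changing a long
        # word costs `diff` more changed lines than changing a short one.
        lines.extend([word] * len(word))
    if not lines:
        return ""
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "a" yields 1 line, "to" yields 2 lines, "the" yields 3 lines.
    print(expand_words("a to the"), end="")
```

A file built this way has roughly one line per non-whitespace character of
the original text, which is how inputs of modest byte size end up with
line counts like the ones below.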

So the files' sizes were:

    % wc tmp/diff-bug-* 
    1038026  851160 4963190 tmp/diff-bug-1
      65041   54877  314788 tmp/diff-bug-2
    1103067  906037 5277978 total
    %

With --speed-large-files, diff still took almost a minute to return an
answer (which is 973026 lines long).

Those files aren't exactly security-sensitive, but they contain personal
info that I'd rather not make public (I can send them in private
upon request, tho).  Is there a chance this performance behavior is the
result of a performance bug, or is the algorithm really that costly?


        Stefan




