Re: Hideously slow VC status queries fixed

From: Eric S. Raymond
Subject: Re: Hideously slow VC status queries fixed
Date: Sat, 29 Dec 2007 16:49:56 -0500
User-agent: Mutt/1.5.15+20070412 (2007-04-11)

Tom Tromey <address@hidden>:
> What this says to me is that we're simply running a lot of plain emacs
> lisp, and that regex matching is not a big problem here.

That is very valuable information, thank you.

>                                                         (I can send
> the whole list if you are interested.)

I'm not.  It looks like you're good enough at collecting and
interpreting those numbers that I don't need to be.  And, frankly, I
like having someone else do that -- you're likely to spot things I
might miss because I'm not seeing past my assumptions.

I'm going to do another rewrite of vc-dired-hook shortly, and I'd
appreciate it if you'd profile the results and compare those numbers
against the baseline you've established.

> I wonder if there is a way to do a lot less work.  For instance, could
> we have VC look only at files that are not 'up-to-date?  In my tree
> this would mean processing 24 files -- 3 orders of magnitude fewer.  I
> think this would be a pretty common result for large trees, since it
> is rare to have a patch that touches a substantial fraction of gcc.

I think that you are not quite understanding the problem here -- or at
least my assumptions about what constitutes "less work" (while
possibly incorrect) are quite different from yours.

Terminological note: by "the VC status command" I mean "svn status" or
"hg status" or whatever.  The thing that VC mode captures output from
and parses.

First, I assume that the time for the underlying VC status command to collect
its information is roughly constant, independent of the verbosity level
of the report.  It has to grovel through the same data structures
either way -- its verbosity options only control what it dumps to
standard out.

Second assumption: to make VC-Dired refresh quickly, the main thing we
need to avoid is lots of client/server round trips -- or, assuming
we're staying local, lots of individual status-command executions.
Either way, this pushes us towards relying on one big report from one
VC status command execution.
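
To illustrate assumption #2, here is a minimal sketch in Python (not
VC's actual Emacs Lisp).  Everything in it is hypothetical: the
FakeStatusCommand class stands in for "svn status" or "hg status", and
the "STATE PATH" report format is made up for the example.  It only
counts how many times the status command gets run under each strategy.

```python
class FakeStatusCommand:
    """Hypothetical stand-in for 'svn status' / 'hg status'.

    It counts invocations, since each real invocation would cost a
    process startup or a client/server round trip.
    """

    def __init__(self, tree):
        self.tree = tree   # maps path -> state code, e.g. {"a.c": "M"}
        self.calls = 0

    def report(self, paths=None):
        """Return a report for the whole tree, or only for `paths`."""
        self.calls += 1
        wanted = self.tree if paths is None else {p: self.tree[p] for p in paths}
        # Made-up format: one "STATE PATH" line per file.
        return "".join(f"{state} {path}\n" for path, state in wanted.items())


def parse_report(text):
    """Parse the made-up 'STATE PATH' lines into a path -> state dict."""
    states = {}
    for line in text.splitlines():
        state, path = line.split(None, 1)
        states[path] = state
    return states


def states_one_big_report(cmd):
    """One execution, one big report covering the whole tree."""
    return parse_report(cmd.report())


def states_per_file(cmd):
    """One execution per file: the pattern assumption #2 says to avoid."""
    states = {}
    for path in cmd.tree:
        states.update(parse_report(cmd.report([path])))
    return states
```

Both functions produce the same path-to-state table; the difference is
that the per-file version runs the command once for every file in the
tree, while the big-report version runs it exactly once.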

By contrast, I think the size of the data returned by the VC status
command, and the parsing time required for VC mode to pull that data
into Lisp-space, is much less significant.  Or, to put it a different
way, I'm assuming that the startup latency of the report generator(s)
dominates the total time from the C-x v d keystroke to the display.

Under these assumptions, trying to make the VC status command look at fewer
files doesn't make a lot of sense -- it won't save any execution time,
if assumption #1 is true.  And if we don't capture the complete
VC state of the tree during the dir-state call at the beginning of
vc-dired-hook, we're just going to have to do more expensive round
trips later, during the remainder of vc-dired-hook, to get the missing
information. By assumption #2 this would be a big lose.

(In fact it's doing that kind of multiple-round-trip patch-up now, which is
what I hope to eliminate in the upcoming rewrite.)

Your report that regexp-matching time doesn't dominate appears to confirm
my theory. If you can find a way to crunch your profiling data that 
separates the latency cost of getting the report from the rest of the
interpretation time, it would be useful to compare the two.
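
One way such a split could be measured, sketched in Python rather than
with the Emacs Lisp profiler: time the status command's execution and
the parsing of its output as two separate intervals.  The function name
timed_status and its arguments are assumptions made for this sketch,
not part of VC.

```python
import subprocess
import time


def timed_status(argv, parse):
    """Run `argv` once, then parse its output, timing each phase.

    Returns (parsed, latency_seconds, parse_seconds).  `argv` is the
    status command line and `parse` is any function that takes the
    report text; both are supplied by the caller.
    """
    t0 = time.perf_counter()
    # Phase 1: the cost of getting the report (startup plus collection).
    out = subprocess.run(
        argv, capture_output=True, text=True, check=True
    ).stdout
    t1 = time.perf_counter()
    # Phase 2: the cost of interpreting the report.
    parsed = parse(out)
    t2 = time.perf_counter()
    return parsed, t1 - t0, t2 - t1
```

If the first interval is much larger than the second across a range of
tree sizes, that would support the latency-dominates assumption; if the
second grows faster with tree size, it would argue against it.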

> Thank you for working on this.  I appreciate it quite a bit.

You are welcome, and I am in turn grateful for the profiling data.
Since I'm trying to optimize for performance, those numbers are the
most valuable guidance I can have.
--
                Eric S. Raymond <http://www.catb.org/~esr/>