Re: Hideously slow VC status queries fixed

From: Tom Tromey
Subject: Re: Hideously slow VC status queries fixed
Date: Sat, 29 Dec 2007 12:18:40 -0700
>>>>> "Eric" == Eric S Raymond <address@hidden> writes:

Eric> I don't think the bottleneck is VC anymore.  In the normal cases (CVS,
Eric> SVN, Mercurial), VC now generates just one (1) command.  I suppose
Eric> parsing time could be an issue -- has anyone profiled 26,499 regexp
Eric> matches lately?

I looked at this a little bit.

Here's the top of the elp output (any function after this took less
than 1 second elapsed time).  I used elp-instrument-package with "vc-"
as the argument (I had never used elp before; let me know if I did
something wrong).

Function Name                        Call Count  Elapsed Time  Average Time
===================================  ==========  ============  ============
vc-call-backend                      346         31.258875999  0.0903435722
vc-backend                           25650       15.846656999  0.0006178033
vc-registered                        34          8.9951210000  0.2645623823
vc-stay-local-p                      1           7.837376      7.837376
vc-file-getprop                      51331       7.2145619999  0.0001405498
vc-file-setprop                      238371      6.6009059999  2.769...e-05
vc-state                             25613       2.4212969999  9.453...e-05

Interestingly, vc-svn doesn't show up here at all.
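
For anyone who wants to repeat the elp measurement, here is a minimal
sketch of the session described above.  The wrapper name
my-elp-profile-vc and the idea of passing the slow operation as a thunk
are my own assumptions for illustration; only elp-instrument-package
with "vc-" comes from the run itself.

```elisp
(require 'elp)
(require 'vc)   ; elp only instruments functions that are already defined

;; Hypothetical helper, not part of elp or VC.
(defun my-elp-profile-vc (thunk)
  "Call THUNK with every `vc-' function instrumented, then show elp results."
  (elp-instrument-package "vc-")
  (unwind-protect
      (progn
        (elp-reset-all)          ; discard timings from earlier runs
        (funcall thunk)          ; run the operation being measured
        (elp-results))           ; display the timing table
    (elp-restore-all)))          ; always remove the instrumentation

;; Usage sketch; the path is a placeholder for a file in a large tree:
;; (my-elp-profile-vc (lambda () (vc-state "/path/to/some/file")))
```

As far as I understand elp, the elapsed time of a function includes the
time spent in the functions it calls, so the rows in the table overlap
and should not be added together.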

I also looked at this with oprofile.  This is interesting too:

samples  %        image name               symbol name
1695163  48.2143  emacs                    Fbyte_code
295864    8.4150  emacs                    arithcompare
238998    6.7976  emacs                    mark_object

Regular expression matching comes in at #13:

31889     0.9070  emacs                    re_match_2_internal

What this says to me is that we're simply running a lot of plain emacs
lisp, and that regex matching is not a big problem here.  (I can send
the whole list if you are interested.)

I wonder if there is a way to do a lot less work.  For instance, could
we have VC look only at files that are not 'up-to-date?  In my tree
this would mean processing 24 files -- 3 orders of magnitude fewer.  I
think this would be a pretty common result for large trees, since it
is rare to have a patch that touches a substantial fraction of gcc.
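
To make the idea concrete, here is a rough sketch of the filtering step
only.  The function name and the optional STATE-FN argument are
hypothetical, added so the filter can be tried without a real working
copy.  Note that this version still asks for the state of every file,
so by itself it saves nothing; the real gain would have to come from
the backend reporting only the files that are not 'up-to-date.

```elisp
(require 'vc)

;; Hypothetical helper, not an existing VC function.
(defun my-vc-files-needing-attention (files &optional state-fn)
  "Return the members of FILES whose VC state is not `up-to-date'.
STATE-FN maps a file name to a state symbol; it defaults to `vc-state'."
  (let ((state-fn (or state-fn #'vc-state))
        (result '()))
    (dolist (file files (nreverse result))
      ;; Anything other than `up-to-date' is kept, including files for
      ;; which the state function returns nil.
      (unless (eq (funcall state-fn file) 'up-to-date)
        (push file result)))))

;; Usage sketch with a stand-in state function:
;; (my-vc-files-needing-attention
;;  '("a.c" "b.c" "c.c")
;;  (lambda (f) (if (equal f "b.c") 'edited 'up-to-date)))
```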

Thank you for working on this.  I appreciate it quite a bit.


