[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: speedup bootstrap

From: Ralf Wildenhues
Subject: Re: speedup bootstrap
Date: Mon, 24 Oct 2005 10:57:49 +0200
User-agent: Mutt/1.5.11

Hi Gary, Peter,

* Gary V. Vaughan wrote on Mon, Oct 24, 2005 at 10:32:52AM CEST:
> Peter O'Gorman wrote:
> >Ralf Wildenhues wrote:
> >|
> >| So, let's do in some manual tail recursion elimination.
> Wow!  Great work :-D


> >| OK to apply the combined patch below?
> >
> >You know, I looked at sppeding up bootstrap for myself about a year ago, 
> >but couldn't figure out the cause.

Don't be fooled into thinking that this patch needed less than a few
months, on and off.  I don't know how experts profile m4 scripts, but I
first tried removing macros one by one (starting with a good guess) and
ended up manually #define'ing DEBUG_SYM in m4-1.4.3 together with a form
of manual statistical profiling: interrupt m4 in gdb every now and then,
look where it's at.  :)

valgrind's massif for overall memory usage profiling showed me m4 uses
little heap until it starts outputting; that's what finally got me to
get to the point that we needed to make use of storage in order to gain
on execution time.  Some off-list discussion with Stepan (thanks, BTW!)
got me on the right track.

> >This is okay, even thouh I don't actually understand it :)

Before this patch, we did this (informal notation):

  join(Separator, Arg1, Arg2, ...)
    if Arg1 last arg
      produce Arg1
    else if Arg1 empty
      call join(Separator, Arg2, Arg3, ...)
      call join(Separator, Arg1Separator, Arg2, ...)
    end if

which ends up recursively at:
    -> ...
        join(Separator, Arg1SeparatorArg2Separator...ArgN-1, ArgN)

Now we do roughly this:
  join(Separator, Arg1, Arg2, ...)
    if Arg1 nonempty
      produce Arg1
      for Arg in [Arg2, ..., ArgN]; do
        produce SeparatorArg2
      end for
      call join(Separator, Arg2, Arg3, ...)
    end if

Similarly with the other two functions, except that lt_combine actually
did a two-staged recursion, and in that recursion called lt_join in each

> No, not okay!
> I haven't tested the code yet, but I do remember that the areas you are
> touching are exceedingly delicate.  And even though on a cursory look,
> your patch looks sane, this is something to apply after 2.0!

Please rethink this judgement again.  I deliberately tried for a minimal-
invasive change; consistency-wise, it would probably be better to work
with m4 lists all the time instead of converting to and from; but my
patch actually keeps each of the three functions and their interfaces
the same.

Also, it might be much more efficient to change _some_ of the code not
to use the dictionary lookup code: a list of tagged/non-tagged variables
may easily be updated in _LT_DECL already;  however, this is not a
minimal change, either.

> On the other hand, thankyou very much for taking the time to fix this
> problem!  I'm going to add your patch to my quilt series and enjoy the
> speedup until 2.0 is done.  It'll also mean that I'll have been able to
> give it a good workout before approving.

I can only rephrase: please rethink your post-2.0 judgement.  Tell me
what you need to be convinced (test suite tests for these functions,

Please also remember that, if _you_ use this, the old version will not
be tested any more.  It might thus just as well break in between,
unnoticed, or be broken before, but breakage only uncovered later.

Surely I would like the patch to be dissected as much as possible before
applying it; I'd also love to see it given a whirl in the next alpha
release, though.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]