Re: [Gnu-arch-users] Re: situations where cached revisions are not so go

gnu-arch-users

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Gnu-arch-users] Re: situations where cached revisions are not so go

From:	Tom Lord
Subject:	Re: [Gnu-arch-users] Re: situations where cached revisions are not so good
Date:	Sun, 28 Sep 2003 08:40:33 -0700 (PDT)


    > From: Jason McCarty <address@hidden>

    > I should have explained my least upper bound (lub) and greatest lower
    > bound (glb) operations more completely, as I think much of your
    > criticism stems from not catching my meaning. 

Yes, and after reading your explanation I reread the algorithm and it
is crystal clear to me now.

    > That's possible. I treated cachedrevs as an afterthought when I should
    > have implemented them fully. In step 6, if a cachedrev for GOAL existed
    > you could retrieve it and return. I think it might be advantageous to
    > have a flag in ~/.arch-params/=locations for each archive that stated
    > whether to utilize cachedrevs in arch_build_revision for that archive.

Perhaps storing sizes of tar-bundles in archives starts to make a lot
of sense.

Let's suppose that while looking backwards I find a cached-rev and
know its size.   Let's suppose that whenever I look at a summary delta
or ordinary commit delta, I know _it's_ size.

Having found a cachedrev early, in the backwards search, I now have an
upper bound (in bytes of tar bundles) on how much additional backwards
searching to do looking for a changeset-path to ALMOST.

I think the =locations flag could reasonably be just "local or
remote?" and that's already there in the URL.

One way to do it is to say that the cost of a path is something like:

        10000 * kbs-to-transfer + n-changesets-to-apply

and pfs-fs.c (the local archive implementation) can (for this purpose)
always report kbs-to-transfer as 0.

There's a need for a special case or two in there: finding a tag of an
unregistered archive after having earlier found a path to an existing
ALMOST should work, not error out; there's also a prediction to take
advantage of: "If I've found a purely local path (but searching is
still continuing) but the backwards search would take me to a remote
archive -- assume that if the local path has a score < 10000 that its
not even worth connecting to the remote archive".  Maybe that
prediction can be generalized by just adding a penalty to the score
whenever a new remote archive has to be connected.


    > Part of the point of summaries (at least in my implementation) is that
    > you don't have to search the ancestry revision-by-revision. You skip
    > large parts of it, which _should_ actually result in fewer round-trips.

Yup.

    > I should give cachedrevs more consideration, 

As an initial approximation, _without_ taking the extra step of
recording sizes in archives (and to work on archives that don't have
them), could just assume that:

        sizeof of a simple-changeset == 1
        sizeof of a summary-changeset == 8
        sizeof of a cachedrev == 64

or somesuch.  And then add precise sizes at leisure.


    > >     > As you can see from my algorithm, the amount of extra code is not 
that
    > >     > large. 

    > > I don't see that yet.   Might be true -- but I don't see it yet.

    > That was hand waving on my part, sorry. We'll both find out if/when I
    > write some code.


No, I see it now.   This looks good.

-t

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [Gnu-arch-users] Re: situations where cached revisions are not so good, (continued)

Prev by Date: Re: [Gnu-arch-users] Re: situations where cached revisions are not so good
Next by Date: Re: [Gnu-arch-users] Re: situations where cached revisions are not so good
Previous by thread: Re: [Gnu-arch-users] Re: situations where cached revisions are not so good
Next by thread: Re: [Gnu-arch-users] Re: situations where cached revisions are not so good
Index(es):
- Date
- Thread