Re: [Gnu-arch-users] RFC: arch protocol, smart server, and tla implement
From: Tom Lord
Subject: Re: [Gnu-arch-users] RFC: arch protocol, smart server, and tla implementation prototypes
Date: Sat, 31 Jan 2004 22:57:24 -0800 (PST)
> From: Aaron Bentley <address@hidden>
>>> According to the thread, the aggregate storage requirement will
>>> be roughly twice that of the equivalent DELTA. I'd be willing
>>> to make the sacrifice of storing that data temporarily to retain
>>> the appearance of a single-threaded program.
>> I think that it varies wildly depending on the project. For
>> example, [....]
> The 50% number comes from using tla as the example, but doing it every
> 200 KiB, not every 100 revisions.
Well, consider this line of reasoning:
The _maximum_ size of a composed changeset is bounded by O(tree-size)
with a small (<< 10) constant factor. Should be obvious why.
Of course, the size of N consecutive changesets is bounded by
N*O(tree-size).
Now, think about how that applies to any subset of a tree. The parts
of a composed changeset pertaining to that subtree are bounded by
O(subtree-size) and the size of those parts in N consecutive
changesets is bounded by N*O(subtree-size).
So, summary deltas certainly win in the worst case.
The degree to which they win in practice depends on circumstance.
They'll win most when the series of changesets being summarized displays
a lot of locality in terms of what code is being modified.
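A toy model may make the bound concrete. The sketch below is only an
illustration, not tla code: it assumes a changeset can be approximated
by the set of files it touches (one bit per file), and every name in it
(changeset_t, files_touched, compose, total_touched) is hypothetical.
Composing N changesets can touch at most the union of their files, so
the composed size never exceeds the tree size, while the N separate
changesets add up.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model (an assumption for illustration): a changeset is the set
 * of files it touches, one bit per file in a small tree. */
typedef unsigned int changeset_t;

/* Number of files a changeset touches (count of set bits). */
static size_t files_touched(changeset_t cs)
{
    size_t n = 0;
    for (; cs != 0; cs >>= 1)
        n += cs & 1u;
    return n;
}

/* A composed (summary) changeset touches at most the union of the
 * files touched by its parts, hence at most the whole tree. */
static changeset_t compose(const changeset_t *cs, size_t n)
{
    changeset_t sum = 0;
    assert(cs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        sum |= cs[i];
    return sum;
}

/* Storing the N changesets separately costs the sum of their sizes. */
static size_t total_touched(const changeset_t *cs, size_t n)
{
    size_t total = 0;
    assert(cs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += files_touched(cs[i]);
    return total;
}
```

With five revisions that keep modifying the same two files, e.g.
{ 0x3, 0x1, 0x2, 0x3, 0x1 }, the composed changeset touches 2 files
while the separate changesets touch 7 in total. With five revisions
that each touch a different file, both numbers are 5, so the summary
wins nothing. Real changeset sizes depend on diff content, not just on
file counts, so this only shows the shape of the argument.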
> I'm not arguing that deltas aren't good. But they *are* bigger than the
> changeset for a single revision. When deltas aren't possible, I think a
> case could be made for storing twice as much data, temporarily. But it
> depends on priorities.
I would somewhat prefer that the get_files interface in pfs.c be
designed for asynchronous, one-file-at-a-time processing. If, in
fact, pfs.c initially (and perhaps finally) buffers all N files --
that's fine.
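One possible shape for such an interface, as a rough sketch only: the
caller hands in a callback that receives one file at a time, and the
implementation is free to buffer everything first. None of these names
or signatures come from pfs.c; file_ready_fn, struct buffered_file,
get_files_sketch and count_bytes are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: called once per file as it becomes
 * available. Returning nonzero asks the producer to stop early. */
typedef int (*file_ready_fn)(const char *path, const char *data,
                             size_t len, void *closure);

/* Hypothetical record for a file that has already been fetched. */
struct buffered_file {
    const char *path;
    const char *data;
    size_t len;
};

/* Buffering implementation of the one-file-at-a-time interface: all
 * N files are already in memory, but the caller still sees them one
 * by one, so a streaming implementation could replace this later
 * without changing any caller. */
static int get_files_sketch(const struct buffered_file *files,
                            size_t n_files,
                            file_ready_fn on_file, void *closure)
{
    assert(on_file != NULL);
    for (size_t i = 0; i < n_files; i++) {
        int status = on_file(files[i].path, files[i].data,
                             files[i].len, closure);
        if (status != 0)
            return status;  /* propagate the caller's stop request */
    }
    return 0;
}

/* Example callback: adds each file's length to a running total. */
static int count_bytes(const char *path, const char *data,
                       size_t len, void *closure)
{
    (void)path;
    (void)data;
    *(size_t *)closure += len;
    return 0;
}
```

A caller would pass count_bytes (or its own handler) together with a
pointer to its state. The point of the design is that callers never
assume all N files are present at once, so whether the implementation
buffers or streams stays a private detail.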
> >(That and/or partial commits.)
> I'm working on inode snapping for partial commits right now.
I mean expanding the semantics of what kinds of partial commits are
possible --- to better approximate a cvs-like interface.
-t