Re: [Monotone-devel] Re: How new-style codeville merge works
From: Zack Weinberg
Subject: Re: [Monotone-devel] Re: How new-style codeville merge works
Date: Sun, 08 May 2005 11:00:51 -0700
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.0.50 (gnu/linux)

Nathan Myers <address@hidden> writes:
> On Sat, May 07, 2005 at 09:29:17PM -0700, Zack Weinberg wrote:
>> I think it would be possible to make [the second half of an s.file]
>> append-only too, but that would probably hurt checkout performance.
>
> ... unless it started with the file position of the start of (the
> rest of the) metadata, or with just the fixed-size part of the
> metadata, or with just the metadata that was already known the
> last time the file had to be rewritten from the beginning, and
> the rest appended incrementally.

I don't understand what you are trying to say here.

The second half of an s.file - the database of lines - generally has
to be rewritten from scratch whenever the file content changes. This
would remain true even if it were separated from the metadata. It
happens because the second half of an s.file looks something like this:

    version 1 of line 1
    version 2 of line 1
    version 1 of line 2
    version 2 of line 2
    ...

so whenever a line is inserted, it shoves everything after that point
down one. I wish there were a good academic paper on weave format to
refer you to, but there isn't. (The Rochkind paper on SCCS fails to
explain it very well.)
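
A minimal sketch of why the interleaved layout forces a rewrite, using the simplified picture above rather than the real s.file format. The record encoding and the names `serialize`, `insert_version`, and `first_changed_byte` are hypothetical, chosen only for illustration.

```python
# Simplified model of the interleaved layout (NOT the real SCCS s.file
# format): records are stored in line order, with all versions of a line
# adjacent to each other.

def serialize(weave):
    """weave: list of (line, version, text) tuples in on-disk order."""
    return "".join(f"{line} {ver} {text}\n" for line, ver, text in weave).encode()

def insert_version(weave, line_no, version, text):
    """Return a new weave with the record placed after the existing
    records for line_no, i.e. before the first record of a later line."""
    pos = len(weave)
    for i, (line, _ver, _text) in enumerate(weave):
        if line > line_no:
            pos = i
            break
    return weave[:pos] + [(line_no, version, text)] + weave[pos:]

def first_changed_byte(old, new):
    """Offset of the first byte at which two serializations differ."""
    n = min(len(old), len(new))
    for i in range(n):
        if old[i] != new[i]:
            return i
    return n

weave = [(1, 1, "a"), (1, 2, "A"), (2, 1, "b"), (2, 2, "B"), (3, 1, "c")]
old = serialize(weave)
new = serialize(insert_version(weave, 1, 3, "AA"))

# Everything from this offset to the end of the file has moved.
offset = first_changed_byte(old, new)
```

Here the new record lands near the front of the file, so `offset` is well short of `len(old)` and every later record has to be written again at a new position.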
My suspicion is that you could re-sort the on-disk weave by version
number, like so:

    version 1 of line 1
    version 1 of line 2
    version 1 of line 3
    ...
    version 2 of line 2
    version 2 of line 4
    ...
    version 3 of line 12

making it append-only, without having to add much information to the
metadata half of the file. However, operating on this would be more
complicated, hence slower. It's possible that it would be okay
performance-wise to read the whole file, rearrange it into the
conventional format in memory, then operate. It's also possible that
we don't mind rewriting this half of the file from scratch every time,
e.g. because it's stored under compression and therefore gets
rewritten from scratch every time anyway.
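
A rough sketch of the version-sorted idea, including the "read it all, rearrange in memory, then operate" step. It assumes every record carries a sortable line key, which is not specified above; how a line inserted between two existing lines would be keyed is left open. The names `append_version` and `to_line_order` are hypothetical.

```python
# Hypothetical sketch of the version-sorted layout: records are appended in
# version order, and the line-ordered form is rebuilt in memory on read.
# ASSUMPTION: each record carries a sortable line key.

def serialize(records):
    """records: list of (version, line, text) tuples in on-disk order."""
    return "".join(f"{ver} {line} {text}\n" for ver, line, text in records).encode()

def append_version(log, version, changes):
    """Append the lines changed in `version`; existing records never move.
    changes: list of (line, text) pairs."""
    if log and version <= log[-1][0]:
        raise ValueError("versions must be appended in increasing order")
    return log + [(version, line, text) for line, text in sorted(changes)]

def to_line_order(log):
    """Rearrange into the conventional layout: grouped by line, then version."""
    return sorted(log, key=lambda rec: (rec[1], rec[0]))

log = []
log = append_version(log, 1, [(1, "a"), (2, "b"), (3, "c")])
before = serialize(log)

log = append_version(log, 2, [(2, "B")])
after = serialize(log)

# The old bytes are an untouched prefix of the new file.
is_append_only = after.startswith(before)

# In-memory view in the conventional (line-major) order.
line_ordered = to_line_order(log)
```

The sort on read is the extra cost mentioned above: writes become pure appends, but every operation that wants the conventional order pays for a pass over the whole file first.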
zw