
Re: [Gnu-arch-users] Re: Potential flaw in patch-log pruning in proposal


From: John Meinel
Date: Wed, 27 Oct 2004 23:08:15 -0500
User-agent: Mozilla Thunderbird 0.8 (Windows/20040913)

James Blackwell wrote:
John Meinel wrote:


I think always keeping the patch name, while making the patch-log itself optional, is a good idea. So the '{arch}/=merged' file would contain the equivalent of
tla logs --merges


There's a catch to doing this. For quite a while I've meant to (and
kept putting off) building a command that lets you easily see in which
revisions a "file" was changed.

Doesn't fai have this with fai revisions --modified filename?

Basically you just go back and parse the patch logs, right?


Or possibly with the summaries as well. (This isn't as important, and might cause file size to go up again.)

Then when tla wants to check if a patch has already been merged, it can look for the patch-log, and secondly look for the entry in =merged.
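A minimal sketch of that two-step check, in Python rather than tla's own C. It assumes =merged holds one fully qualified revision name per line; `patch_log_path` is a hypothetical caller-supplied function that maps a revision name to where its patch-log would live in the tree, since the real layout is not spelled out here.

```python
from pathlib import Path


def is_merged(tree_root, revision, patch_log_path):
    """Return True if `revision` is recorded as merged in the tree.

    patch_log_path is a hypothetical callable (tree_root, revision) -> path;
    it stands in for whatever layout tla actually uses for patch-logs.
    """
    # First look for the full patch-log itself.
    if Path(patch_log_path(tree_root, revision)).is_file():
        return True

    # Otherwise the log may have been pruned; fall back to the name list.
    # Assumed format: one fully qualified revision name per line.
    merged = Path(tree_root) / "{arch}" / "=merged"
    if not merged.is_file():
        return False
    with merged.open(encoding="utf-8") as fh:
        return any(line.strip() == revision for line in fh)
```

The linear scan of =merged is the "grep" approach discussed below; it reads the file once per lookup and needs no extra index.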


Aye. I think this would be "good enough" for replay --skip-present.

Sure, and I think that's what we would be stuck with. But remember, you still have the patch log on the main branch (whatever the name of the merge is). So you can still say when the official tree was modified. And if you keep something like
tla logs --merges

Then you can also see what other revisions contributed to it. You don't know exactly which revision did what, but you do keep the context, and you can narrow it down to just a couple of packages.


It might be nicer if =merged was some sort of indexed file format, so that lookups could be faster, but grepping even a 120K file doesn't seem very expensive.


Though I'm in favor of this, it can actually get quite a bit more
expensive than that. It wouldn't be bad for small or medium-sized
projects, but larger, heavily developed projects will see a
concatenated patch-log on the order of 5-6 megabytes per working
copy/revision in the library.


I don't think =merged is going to be all the patch logs "cat"ed together. It's supposed to hold just the fully qualified revision names.

I believe that's what Matthieu stated. The patch logs together came to more than 7 MB, but the list of revisions alone is only 120K.

I was assuming xtla is at least medium sized. Perhaps you are thinking more along the lines of a "gcc". I'm not really sure how to scale that big.

One other possibility is to use ".zip" files instead of tarballs. zlib has native support for them as of something like 1.1 or 1.2. Basically they store each file compressed individually, with an index to find each one. You get better compression with a tarball because you can compress larger chunks, but you don't get indexing.
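To illustrate the indexing point, here is a small sketch using Python's standard zipfile module. The log names and contents are made up for the example, and the archive is built in memory instead of on disk.

```python
import io
import zipfile

# Made-up patch logs, keyed by an illustrative file name.
logs = {
    "patch-1": "Summary: initial import\n",
    "patch-2": "Summary: fix typo in README\n",
    "patch-3": "Summary: merge from branch\n",
}

# Each member is compressed individually and listed in the central directory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    for name, text in logs.items():
        zf.writestr(name, text)

with zipfile.ZipFile(buf) as zf:
    # The member list comes from the archive's index, not from decompressing data.
    names = set(zf.namelist())
    has_patch_2 = "patch-2" in names
    # Only the requested member is decompressed.
    text = zf.read("patch-2").decode("utf-8")
```

With a compressed tarball, finding or extracting one member generally means decompressing the stream up to that member, which is the trade-off described above.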

.zip has traditionally been a Windows thing, but since zlib supports it (and Python started using it as an alternative to storing your scripts in individual files), I see it becoming useful in more places.

7zip is an interesting compromise. You have an index, but only to the chunk level, where each chunk could be more than one file.

But really, for something like patch logs, all you need is one indexed file that lets you quickly find out whether a given log exists, and then you can pack the actual text of the logs however you want.
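One way that could look, as a rough sketch and not a proposal for tla's actual format: a sorted list of revision names serves as the index (searched by bisection), and the log texts are compressed one by one into a single blob. The function names and the in-memory layout are all assumptions for illustration.

```python
import bisect
import zlib


def build_store(logs):
    """Build (names, offsets, blob) from a dict of revision name -> log text."""
    names = sorted(logs)
    offsets, chunks, pos = [], [], 0
    for name in names:
        data = zlib.compress(logs[name].encode("utf-8"))
        offsets.append((pos, len(data)))  # where this log sits in the blob
        chunks.append(data)
        pos += len(data)
    return names, offsets, b"".join(chunks)


def has_log(names, revision):
    """Existence check that touches only the sorted name index."""
    i = bisect.bisect_left(names, revision)
    return i < len(names) and names[i] == revision


def read_log(names, offsets, blob, revision):
    """Decompress just the one requested log."""
    i = bisect.bisect_left(names, revision)
    if i == len(names) or names[i] != revision:
        raise KeyError(revision)
    start, size = offsets[i]
    return zlib.decompress(blob[start:start + size]).decode("utf-8")
```

The existence check never reads the blob, so the log text could just as well be pruned or packed differently without affecting lookups.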
John
=:->


