Re: Software Heritage fifth anniversary event

From: Ludovic Courtès
Subject: Re: Software Heritage fifth anniversary event
Date: Thu, 02 Dec 2021 09:59:18 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)


Timothy Sample skribis:

> Ludovic Courtès writes:


>>   • Disarchive: they’d like to better understand the “unknowns” in the
>>     PoG plots (I wasn’t sure if it was non-tar.gz tarballs or what) and
>>     to work on the definitely-missing origins that show up there;
>
> Many of the unknowns are there for me to track Disarchive progress.
> It’s not really the clearest reporting, but it tracks more what Guix can
> handle automatically than what we could theoretically know about.
> Basically something is “known” if it can be downloaded from upstream,
> and either: it’s a non-recursive Git reference; or it’s something
> Disarchive can handle.  Hence, we know nothing about other version
> control systems and, say, “.tar.bz2” archives.  Also, all these things
> are based on heuristics.  :)  As we get closer to 100% known, we can
> start analyzing everything more closely.

Right.  Perhaps at some point we can give them (say on swh-devel) this
explanation so they have a clearer view of how far Disarchive is from
being “production-ready” from an SWH perspective.  Valentin of the SWH
team played a lot with pristine-tar and I’m sure they’d have useful
feedback to give.

>>     they’re not opposed to the idea of eventually hosting or maintaining
>>     the Disarchive database (in fact one of the developers thought we
>>     were hosting it in Git and that as such they were already archiving
>>     it—maybe we could go back to Git?);
>
> It’s a possibility, but right now I’m hopeful that the database will be
> in the care of SWH directly before too long.  I’d rather wait and see at
> this point.  I’m sure we could manage it, but the uncompressed size of
> the Disarchive specification of a Chromium tarball is 366M.  Storing all
> the XZ specifications uncompressed is over 20G.  It would be a big Git
> repo!


So, in passing, you’re telling us that xz support is kinda ready, right?


