[bug#33899] [PATCH 0/5] Distributing substitutes over IPFS
From: Ludovic Courtès
Subject: [bug#33899] [PATCH 0/5] Distributing substitutes over IPFS
Date: Fri, 18 Jan 2019 10:52:49 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)
Hello,
Hector Sanjuan <address@hidden> skribis:
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Monday, January 14, 2019 2:17 PM, Ludovic Courtès <address@hidden> wrote:
[...]
>> Yes, I’m well aware of “unixfs”. The problem, as I see it, is that it
>> stores “too much” in one way (we don’t need to store the mtimes or
>> permissions, though we could ignore them upon reconstruction), and “not
>> enough” in another way (the executable bit is lost, IIUC).
>
> Actually, the only metadata that Unixfs stores is the size:
> https://github.com/ipfs/go-unixfs/blob/master/pb/unixfs.proto and in
> any case the amount of metadata is negligible compared to the actual
> data stored; it also serves to give you a progress bar when you are
> downloading.
Yes, the format I came up with also stores the size so we can eventually
display a progress bar.
> Having IPFS understand which files are part of a single item is important
> because you can pin/unpin, diff, and patch all of them as a whole. Unixfs
> also takes care of handling the case where directories need to be
> sharded because they have too many entries.
Isn’t there a way, then, to achieve the same behavior with the custom
format? The /api/v0/add entry point has a ‘pin’ argument; I suppose we
could leave it at false except when we add the top-level “directory”
node. Wouldn’t that give us behavior similar to that of Unixfs?
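Concretely, I would picture something along these lines (a rough sketch
against the HTTP API of a daemon on localhost:5001; the helper names are
made up, not code from the patch):

# Sketch: add leaf blocks unpinned, then pin only the top-level node.
import requests

API = "http://localhost:5001/api/v0"

def add_unpinned(data: bytes) -> str:
    """Add data without pinning it; return the CID the daemon reports."""
    r = requests.post(f"{API}/add", params={"pin": "false"},
                      files={"file": data})
    r.raise_for_status()
    return r.json()["Hash"]

def pin_root(cid: str) -> None:
    """Recursively pin the top-level node, keeping every block it
    references alive with a single pinset entry."""
    r = requests.post(f"{API}/pin/add", params={"arg": cid})
    r.raise_for_status()

Each store item’s files would go through add_unpinned(), and only the
“directory” node tying them together would get pin_root().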
> When the user puts the single root hash in ipfs.io/ipfs/<hash>, it
> will correctly display the underlying files, and people will be able
> to navigate the actual tree with both the web UI and the CLI.
Right, though that’s less important in my view.
> Note that every file added to IPFS gets wrapped as a Unixfs block
> anyway. You are just saving some "directory" nodes by adding them
> separately.
Hmm, weird. When I do /api/v0/add, I’m really just passing a byte
vector; there’s no notion of a “file” here, AFAICS. Or am I missing
something?
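If I wanted to check, I suppose I could inspect the node that comes
back; a sketch, assuming a local daemon (I have not verified the exact
shape of the output):

# Sketch: add a bare byte vector, then decode the resulting node to see
# whether it comes back wrapped as a unixfs (dag-pb) object.
import requests

API = "http://localhost:5001/api/v0"

resp = requests.post(f"{API}/add", params={"pin": "false"},
                     files={"file": b"just a byte vector"})
cid = resp.json()["Hash"]

# /api/v0/dag/get decodes the node; for unixfs-wrapped data it should
# show a dag-pb node with a unixfs Data field and a list of links.
print(requests.post(f"{API}/dag/get", params={"arg": cid}).json())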
>> > It will probably take some trial and error to get the multipart right
>> > so as to upload everything in a single request. The Go HTTP clients
>> > doing this can be found at:
>> > https://github.com/ipfs/go-ipfs-files/blob/master/multifilereader.go#L96
>> > As you see, a directory part in the multipart will have the
>> > Content-Type header set to "application/x-directory". The best way to
>> > see how "abspath" etc. is set is probably to sniff an
>> > `ipfs add -r <testfolder>` operation (localhost:5001).
>> > Once UnixFSv2 lands, you will be in a position to just drop the sexp
>> > file altogether.
>>
>> Yes, that makes sense. In the meantime, I guess we have to keep using
>> our own format.
>>
>> What are the performance implications of adding and retrieving files one
>> by one like I did? I understand we’re doing N HTTP requests to the
>> local IPFS daemon where “ipfs add -r” makes a single request, but this
>> alone can’t be much of a problem since communication is happening
>> locally. Does pinning each file separately somehow incur additional
>> overhead?
>>
>
> Yes, pinning separately is slow and incurs overhead. Pins are themselves
> stored in a merkle tree, so each pin involves reading, patching, and
> saving that tree. This gets quite slow when you have very large pinsets,
> because the pinset blocks grow, and your pinset will grow very large if
> you do this. Additionally, the pinning operation itself requires a
> global lock, which slows it down further.
OK, I see.
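As an aside, coming back to the single-request upload you described:
going by the go-ipfs-files code you linked to, I would expect the
multipart layout to be roughly as in the sketch below. The field names,
the URL-escaped relative paths, and the empty directory body are my
guesses; sniffing an actual ‘ipfs add -r’, as you suggest, remains the
authoritative reference.

# Sketch: one multipart request carrying a directory and a file, in the
# style of `ipfs add -r`.  Directory parts get the
# "application/x-directory" content type; file parts are named by their
# (escaped) relative path.
import requests

API = "http://localhost:5001/api/v0"

parts = [
    # (field, (filename, body, content-type)); the escaped path and the
    # empty directory body are assumptions based on the quoted Go code.
    ("file", ("testfolder", "", "application/x-directory")),
    ("file", ("testfolder%2Fhello.txt", b"hello world\n",
              "application/octet-stream")),
]

r = requests.post(f"{API}/add", files=parts)
for line in r.iter_lines():
    print(line)  # the daemon streams one JSON object per added entry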
> But even if it were fast, you would not have an easy way to unpin
> anything that becomes obsolete, or to get an overview of where things
> belong. It is also unlikely that a single IPFS daemon will be able to
> store everything you build, so you might soon find yourself using IPFS
> Cluster to distribute the storage across multiple nodes, at which point
> you will effectively be adding remotely.
Currently, ‘guix publish’ stores things as long as they are requested,
and then for the duration specified with ‘--ttl’. I suppose we could
have similar behavior with IPFS: if an item hasn’t been requested for
the specified duration, then we unpin it.
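Roughly like the sketch below, assuming ‘guix publish’ keeps a map from
CID to the time of the last request (that bookkeeping is hypothetical):

# Sketch: unpin items whose last request is older than the TTL.
import time
import requests

API = "http://localhost:5001/api/v0"
TTL = 30 * 24 * 3600  # say, 30 days, in the spirit of --ttl

def unpin_stale(last_requested: dict) -> None:
    """Remove the pin of every item not requested within TTL seconds."""
    now = time.time()
    for cid, last in list(last_requested.items()):
        if now - last > TTL:
            requests.post(f"{API}/pin/rm", params={"arg": cid})
            del last_requested[cid]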
Does that make sense?
Thanks for your help!
Ludo’.