[bug#33899] [PATCH 0/5] Distributing substitutes over IPFS


From: Hector Sanjuan
Subject: [bug#33899] [PATCH 0/5] Distributing substitutes over IPFS
Date: Fri, 18 Jan 2019 09:08:02 +0000

------- Original Message -------
On Monday, January 14, 2019 2:17 PM, Ludovic Courtès <address@hidden> wrote:

> Hi Hector,
>
> Happy new year to you too! :-)
>
> Hector Sanjuan address@hidden skribis:
>
> > 1.  The doc strings usually refer to the IPFS HTTP API as GATEWAY. go-ipfs
> >     has a read/write API (on :5001) and a read-only API that we call
> >     "gateway", which runs on :8080. The gateway, apart from handling most
> >     of the read-only methods from the HTTP API, also handles paths like
> >     "/ipfs/<cid>" or "/ipns/<name>" gracefully, and returns an
> >     autogenerated web page for directory-type CIDs. The gateway does not
> >     allow publishing. Therefore I think the doc strings should say "IPFS
> >     daemon API" rather than "GATEWAY".
> >
>
> Indeed, I’ll change that.
>
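
For concreteness, a rough Go sketch of the split between the two,
assuming the default ports (untested; the CID is just a placeholder):

package main

import (
    "io"
    "net/http"
    "os"
)

func main() {
    cid := os.Args[1] // placeholder: CID of some store item

    // Read-only gateway on :8080: plain GET of /ipfs/<cid> paths.
    resp, err := http.Get("http://127.0.0.1:8080/ipfs/" + cid)
    if err != nil {
        panic(err)
    }
    io.Copy(os.Stdout, resp.Body)
    resp.Body.Close()

    // Read/write daemon API on :5001: RPC-style endpoints under /api/v0/,
    // including ones the gateway will not serve, such as publishing.
    resp, err = http.Post("http://127.0.0.1:5001/api/v0/name/publish?arg=/ipfs/"+cid,
        "", nil)
    if err != nil {
        panic(err)
    }
    io.Copy(os.Stdout, resp.Body)
    resp.Body.Close()
}
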
> > 2.  I'm not proficient enough in Scheme to grasp the details of the
> >     "directory" format. If I understand it right, you keep a separate
> >     manifest object listing the directory structure, the contents and
> >     the executable bit for each file. Thus, when adding a store item you
> >     add all the files separately, plus this manifest. And when retrieving
> >     a store item you fetch the manifest and reconstruct the tree by
> >     fetching the contents listed in it (and applying the executable
> >     flags). Is this correct? This works, but it can be improved:
> >
>
> That’s correct.
>
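
Roughly, the retrieval loop I picture is something like the sketch
below. The manifest is shown as a hypothetical JSON list purely for
illustration (the real thing is your sexp file), and the daemon is
assumed to be on :5001:

package main

import (
    "encoding/json"
    "io"
    "net/http"
    "os"
    "path/filepath"
)

// Hypothetical stand-in for one manifest entry; the real manifest is the
// sexp file discussed in this thread.
type entry struct {
    Path       string `json:"path"`
    CID        string `json:"cid"`
    Executable bool   `json:"executable"`
}

func main() {
    var manifest []entry
    if err := json.NewDecoder(os.Stdin).Decode(&manifest); err != nil {
        panic(err)
    }
    for _, e := range manifest {
        // One /api/v0/cat request per file: this is the N-requests scheme.
        resp, err := http.Post("http://127.0.0.1:5001/api/v0/cat?arg="+e.CID,
            "", nil)
        if err != nil {
            panic(err)
        }
        dst := filepath.Join("out", e.Path) // "out" is an arbitrary target
        os.MkdirAll(filepath.Dir(dst), 0755)
        mode := os.FileMode(0644)
        if e.Executable {
            mode = 0755 // apply the executable bit recorded in the manifest
        }
        f, err := os.OpenFile(dst, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, mode)
        if err != nil {
            panic(err)
        }
        io.Copy(f, resp.Body)
        f.Close()
        resp.Body.Close()
    }
}
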
> > You can add all the files/folders in a single request. If I'm reading
> > it right, each file is currently added (and pinned) separately. It
> > would probably make sense to add everything in a single request,
> > letting IPFS store the directory structure as "unixfs". You can
> > additionally add the sexp file with the dir structure and executable
> > flags as an extra file in the root folder. This would allow fetching
> > the whole thing with a single request too (/api/v0/get?arg=<hash>),
> > and pinning a single hash recursively (rather than each file
> > separately). After getting the whole thing, you will need to
> > chmod +x things accordingly.
>
> Yes, I’m well aware of “unixfs”. The problem, as I see it, is that it
> stores “too much” in a way (we don’t need to store the mtimes or
> permissions; we could ignore them upon reconstruction though), and “not
> enough” in another way (the executable bit is lost, IIUC).

Actually, the only metadata that Unixfs stores is size:
https://github.com/ipfs/go-unixfs/blob/master/pb/unixfs.proto
In any case the amount of metadata is negligible compared to the actual
data stored, and it serves to give you a progress bar when you are
downloading.

Having IPFS understand which files are part of a single item is
important because you can then pin/unpin, diff and patch all of them as
a whole. Unixfs also takes care of sharding directories when they have
too many entries. When a user puts the single root hash into
ipfs.io/ipfs/<hash>, it will correctly display the underlying files, and
people will be able to navigate the actual tree with both the web and
the CLI. Note that every file added to IPFS gets wrapped as a Unixfs
block anyway; you are just saving some "directory" nodes by adding
files separately.
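
To illustrate the single-request retrieval: /api/v0/get streams the
whole tree back as a tar archive, so the fetch side could look roughly
like this (untested sketch; "out" is an arbitrary target directory):

package main

import (
    "archive/tar"
    "io"
    "net/http"
    "os"
    "path/filepath"
)

func main() {
    root := os.Args[1] // root CID of the whole store item

    // One request for the whole tree; the daemon streams it back as tar.
    resp, err := http.Post("http://127.0.0.1:5001/api/v0/get?arg="+root,
        "", nil)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    tr := tar.NewReader(resp.Body)
    for {
        hdr, err := tr.Next()
        if err == io.EOF {
            break
        }
        if err != nil {
            panic(err)
        }
        dst := filepath.Join("out", hdr.Name)
        switch hdr.Typeflag {
        case tar.TypeDir:
            os.MkdirAll(dst, 0755)
        case tar.TypeReg:
            f, err := os.Create(dst)
            if err != nil {
                panic(err)
            }
            io.Copy(f, tr)
            f.Close()
        }
    }
    // A chmod +x pass, driven by the sexp file at the root, would follow
    // here, since unixfs does not carry the executable bit.
}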

There is an alternative: use IPLD to implement a custom block format
that carries the executable-bit information and nothing else. But I
don't see significant advantages at this point for the extra work it
requires.

>
> > It will probably need some trial and error to get the multipart right
> > so as to upload everything in a single request. The Go HTTP client
> > code doing this can be found at:
> > https://github.com/ipfs/go-ipfs-files/blob/master/multifilereader.go#L96
> > As you can see, a directory part in the multipart will have the
> > Content-Type header set to "application/x-directory". The best way to
> > see how "abspath" etc. are set is probably to sniff an
> > `ipfs add -r <testfolder>` operation (localhost:5001). Once UnixFSv2
> > lands, you will be in a position to just drop the sexp file
> > altogether.
>
> Yes, that makes sense. In the meantime, I guess we have to keep using
> our own format.
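
To make the multipart shape a bit more concrete, here is a rough sketch
pieced together from the multifilereader.go code linked above; the exact
part headers ("abspath" etc.) may well need the trial and error I
mentioned:

package main

import (
    "bytes"
    "fmt"
    "io"
    "mime/multipart"
    "net/http"
    "net/textproto"
    "net/url"
    "os"
)

func main() {
    var body bytes.Buffer
    w := multipart.NewWriter(&body)

    // Directory part: no content, Content-Type application/x-directory.
    dh := textproto.MIMEHeader{}
    dh.Set("Content-Disposition", `form-data; name="file"; filename="item"`)
    dh.Set("Content-Type", "application/x-directory")
    w.CreatePart(dh)

    // File part: the filename is the URL-escaped path under the directory.
    fh := textproto.MIMEHeader{}
    fh.Set("Content-Disposition",
        fmt.Sprintf(`form-data; name="file"; filename=%q`,
            url.QueryEscape("item/hello.txt")))
    fh.Set("Content-Type", "application/octet-stream")
    p, _ := w.CreatePart(fh)
    io.WriteString(p, "hello\n")
    w.Close()

    // Single request; pin=true pins only the resulting root, recursively.
    resp, err := http.Post("http://127.0.0.1:5001/api/v0/add?pin=true",
        w.FormDataContentType(), &body)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    io.Copy(os.Stdout, resp.Body) // one JSON line per entry; last is the root
}
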
>
> What are the performance implications of adding and retrieving files one
> by one like I did? I understand we’re doing N HTTP requests to the
> local IPFS daemon where “ipfs add -r” makes a single request, but this
> alone can’t be much of a problem since communication is happening
> locally. Does pinning each file separately somehow incur additional
> overhead?
>

Yes, pinning separately is slow and incurs overhead. Pins are themselves
stored in a merkle tree, so each pin involves reading, patching and
saving that tree. This gets quite slow once the pinset is very large,
because the pinset blocks grow, and pinning every file separately is
exactly what will make it very large. Additionally, the pinning
operation itself takes a global lock, making it slower still.

But even if it were fast, you would have no easy way to unpin anything
that becomes obsolete, nor an overview of where things belong. It is
also unlikely that a single IPFS daemon will be able to store everything
you build, so you might soon find yourself using IPFS Cluster to
distribute the storage across multiple nodes, and then you will
effectively be adding remotely.
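
For completeness, with a single recursive add the pinning side collapses
to one call per store item, along these lines (same assumptions as the
sketches above):

package main

import (
    "io"
    "net/http"
    "os"
)

func main() {
    root := os.Args[1] // root CID returned by the single add request

    // One pin per store item instead of one per file: the whole DAG under
    // the root is pinned recursively, and can later be unpinned as a whole.
    resp, err := http.Post(
        "http://127.0.0.1:5001/api/v0/pin/add?arg="+root+"&recursive=true",
        "", nil)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    io.Copy(os.Stdout, resp.Body)
}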


> Thanks for your feedback!
>
> Ludo’.

Thanks for working on this!

Hector






