gwl-devel

Re: Managing data files in workflows


From: Konrad Hinsen
Subject: Re: Managing data files in workflows
Date: Fri, 26 Mar 2021 13:46:42 +0100

Hi Simon,

> It does not answer your concrete question but instead opens a new
> one. :-)

And a good one!

>  1. how to deal with data?
>  2. on which does the workflow trigger a recomputation?

Number 2 was what I had in mind with my question. And I still wonder
how GWL handles it now and/or will handle it in the near future.

> There are 3 levels:
>
>  1- the methods for fetching: URL (http or ftp), Git, IPFS, Dat, etc.
>  2- the record representing a “data”
>  3- how to effectively locally store and deal with it
>
> And whether it makes sense for a ’data’ to be an input of a
> ’package’, and conversely, is an open question.
>
> A long time ago, with the GWL folks, we discussed a “backend” such as
> git-annex or something else but, from my understanding, it would answer
> #3, and the protocols git-annex accepts would answer #1.  That leaves #2.

Perhaps a good first step is to actually use git-annex for big files,
and then integrate it more and more into Guix and/or GWL. Multiple
backends will certainly be required in the near future, because data
storage is not yet sufficiently standardized to pick one specific
technology. So why not profit from the work already done in git-annex?

One answer to #2 would be to use a git repository managed by git-annex,
with remotes pointing to the repositories that actually hold the data.
Not very elegant, but as a first step, why not?
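
As a rough illustration of why such a repository could serve as the
record for a data set: with its hashing backends, git-annex identifies
each file by a key derived from the file's content, so a changed file
gets a new key. The Python sketch below mimics the layout of keys from
git-annex's SHA256 backend and compares them against the keys recorded
after the previous run. It is not GWL or git-annex code; the names
content_key, needs_recomputation and record_run, and the JSON record
file, are hypothetical and exist only for this example.

```python
import hashlib
import json


def content_key(path, chunk_size=1 << 20):
    """Return a key for a file, laid out like git-annex's SHA256 backend:
    SHA256-s<size in bytes>--<hex digest>. Illustrative only."""
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so that large data files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return f"SHA256-s{size}--{digest.hexdigest()}"


def needs_recomputation(inputs, record_path):
    """Return (changed, current_keys) for the given input files.

    changed is True when no record exists yet or when any input's
    key differs from the one stored in the record file."""
    current = {path: content_key(path) for path in inputs}
    try:
        with open(record_path) as handle:
            previous = json.load(handle)
    except FileNotFoundError:
        previous = None
    return current != previous, current


def record_run(keys, record_path):
    """Store the input keys of a completed run (hypothetical JSON record)."""
    with open(record_path, "w") as handle:
        json.dump(keys, handle, indent=2, sort_keys=True)
```

A workflow step would call needs_recomputation before running and
record_run after it succeeds. In a real git-annex repository the keys
are already tracked by git, so comparing commits could replace the
JSON file assumed here.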

Cheers,
  Konrad.


