Re: leaky pipelines and Guix

From: Ludovic Courtès
Subject: Re: leaky pipelines and Guix
Date: Mon, 07 Mar 2016 10:54:45 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

Ricardo Wurmus <address@hidden> writes:

> Ludovic Courtès <address@hidden> writes:
>> Ricardo Wurmus <address@hidden> writes:
>>> So, how could I package something like that?  Is packaging the wrong
>>> approach here and should I really just be using “guix environment” to
>>> prepare a suitable environment, run the pipeline, and then exit?
>>
>> Maybe packages are the wrong abstraction here?
>>
>> IIUC, a pipeline is really a function that takes inputs and produces
>> output(s).  So it can definitely be modeled as a derivation.
>
> This may be true and the basic abstraction you propose seems correct and
> useful, but I was talking about existing pipelines.  They have already
> been implemented using snakemake or make to keep track of individual
> steps, etc.  My primary concern is with making these pipelines work, not
> to rewrite them.

Oh, got it.

> For a particularly nasty pipeline I’m just using a separate profile
> just for the pipeline dependencies.  Users build the pipeline glue code
> themselves by whatever means they deem appropriate and then load the
> profile in a subshell:
>
>     bash
>     source /path/to/pipeline-profile/etc/profile
>     # run the pipeline here
>     exit
>
> I think that these existing bio pipelines should really be treated more
> like configurable packages.  For a pipeline that we’re currently working
> on I’m involved in making sure that it can be packaged and installed.
> We chose to use autoconf to substitute tool placeholders at configure
> time.  This allows us to install the pipeline easily with Guix as we can
> treat tools just as regular runtime dependencies.  At configure time the
> actual full paths to the needed tools are injected into the sources, so
> we don’t need to propagate anything and make assumptions about PATH.
>
> Many problems with bio pipelines stem from the fact that they are not
> treated as first-class applications, so they often don’t have a wrapper
> script, nor a configuration or installation step.  I think the easiest
> way to fix this is to encourage the design of pipelines as real software
> packages rather than distributing bland Makefiles/snakefiles and
> assuming that the user will arrange for a suitable environment.

Indeed.  So if existing pipelines are shell scripts or small programs,
I think it makes sense to treat them as packages.
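
The configure-time substitution described above could look roughly like
the sketch below.  It is only an illustration under stated assumptions:
“sort” stands in for a real bioinformatics tool, and the “@SORT@”
placeholder and the “pipeline.sh.in” file name are hypothetical, not
taken from the actual pipeline.  In a real Autoconf setup the lookup and
substitution would presumably be done with AC_PATH_PROG and
AC_CONFIG_FILES rather than by hand with sed.

```shell
#!/bin/sh
# Sketch: inject absolute tool paths into a pipeline script at
# "configure" time, so that running it does not depend on PATH.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Template of the pipeline script; "@SORT@" is a hypothetical
# placeholder for a tool the pipeline needs at run time.
cat > "$workdir/pipeline.sh.in" <<'EOF'
#!/bin/sh
# Generated file: tool paths were fixed at configure time.
printf 'b\na\n' | @SORT@
EOF

# "Configure" step: resolve the tool once to its full path.
SORT=$(command -v sort)

# Substitute the placeholder with the resolved path.
sed "s|@SORT@|$SORT|g" "$workdir/pipeline.sh.in" > "$workdir/pipeline.sh"
chmod +x "$workdir/pipeline.sh"

# Run the generated script with an empty PATH to check that it
# makes no assumptions about the caller's environment.
output=$(PATH= "$workdir/pipeline.sh")
printf '%s\n' "$output"
```

Because the full path ends up in the installed file, the tool is an
ordinary runtime dependency and nothing has to be propagated into the
user’s profile.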

