Re: What do you use GNU Parallel for?

From: Matt Oates (Home)
Subject: Re: What do you use GNU Parallel for?
Date: Wed, 22 Aug 2012 09:19:42 +0100

Hi Ole,

On 22 August 2012 07:08, Ole Tange wrote:
> So please write a few lines about the tasks you use it for -
> especially if you have reason to believe you are one of the few doing
> that kind of thing. If you want to be anonymous you can write me
> directly, but otherwise use the mailing list.

Good luck with the talk!

I use parallel to parallelise the outer loop of most bioinformatics
software, especially HMMER3. Many of these tools have no built-in
parallelisation, so if I give one a long list of inputs it works
through them serially. I work with quite large datasets: 1,765 genomes,
each having between 1,000 and 10,000 protein sequences. With five
24-core desktops I can really cut back how long something takes. We
even have an internal script that bridges parallel with the EC2 compute
cloud, so if I need to do something extra big I just go wider and hand
the list of EC2 machine names to parallel.
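
As a rough illustration of the outer-loop pattern described above, here
is a minimal Python sketch that runs one serial command per input file,
a few at a time, on a single machine. It is not the setup used here and
not how parallel itself works; the names run_one and run_all, the jobs
parameter, and the stand-in demo command are all assumptions made for
the example. Spreading jobs over several hosts is left out.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_one(command, input_path):
    """Run one serial job: the command with the input path appended."""
    result = subprocess.run(
        command + [input_path], capture_output=True, text=True, check=True
    )
    return input_path, result.stdout


def run_all(command, input_paths, jobs=4):
    """Run the command once per input, at most `jobs` at a time.

    Results come back in the same order as input_paths.
    """
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(lambda path: run_one(command, path), input_paths))


if __name__ == "__main__":
    # Stand-in for a real single-threaded tool (hypothetical): it just
    # echoes its argument in upper case.
    demo = [sys.executable, "-c", "import sys; print(sys.argv[1].upper())"]
    for path, output in run_all(demo, ["a.fasta", "b.fasta"], jobs=2):
        print(path, output.strip())
```

Threads are enough here because each worker only waits on an external
process; the real work happens in the child processes.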

More day to day, I frequently use parallel to convert large files
(hundreds of gigabytes per file) between text-based file formats,
typically by running perl or sed under parallel. I also use the --pipe
feature a lot to split files: a record-based format like FASTA is
splittable with parallel, and I can pipe the data straight into
another program.
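
What makes FASTA splittable is that every record starts with a '>'
header line, so a stream can be cut at those lines without breaking a
sequence. GNU Parallel's --pipe mode has a --recstart option for naming
such a record start. The Python below is only a sketch of the idea, not
parallel's implementation; fasta_chunks and max_chars are hypothetical
names chosen for the example.

```python
import io


def fasta_chunks(stream, max_chars=1_000_000):
    """Yield blocks of whole FASTA records from a text stream.

    A block is closed only at a '>' header line, and only once it holds
    at least max_chars characters, so no record is ever split.
    """
    chunk, size = [], 0
    for line in stream:
        if line.startswith(">") and size >= max_chars:
            yield "".join(chunk)
            chunk, size = [], 0
        chunk.append(line)
        size += len(line)
    if chunk:
        yield "".join(chunk)


if __name__ == "__main__":
    data = ">s1\nACGT\nACGT\n>s2\nGG\n>s3\nTTTT\n"
    for block in fasta_chunks(io.StringIO(data), max_chars=10):
        print(repr(block))
```

Each block could then be written to the standard input of a separate
worker process, which is roughly what piping the split data into
another program amounts to.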

I think you would do well to publish a short paper somewhere in the
bioinformatics field about the speed-ups you can get by using parallel
with older non-parallel software.


