Re: Parallel Merge
From: Ole Tange
Subject: Re: Parallel Merge
Date: Sat, 20 Aug 2011 16:18:38 +0200
On Sat, Aug 20, 2011 at 12:54 AM, Nathan Watson-Haigh
<nathan.watson-haigh@awri.com.au> wrote:
>
> What I'm actually doing is using the ABySS genome assembler. Part of the
> pipeline is:
>
> KAligner | ParseAligns | sort | DistanceEst
>
> KAligner takes sequences from one file (queries) and finds alignments against
> sequences in another file (targets), outputting these in Sequence
> Alignment/Map (SAM) format. ParseAligns takes the SAM format and filters out
> some alignments. It is the ParseAligns step which is slowest and I'm looking
> at how best to split up the work to make use of more cores. A job for early
> next week!
The most obvious way seems to be:
cat queries | parallel --pipe --files 'KAligner | ParseAligns | sort' |
  parallel -Xj1 sort -m {}\;rm {} | DistanceEst
Would that work?
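
To make the idea concrete without the real tools, here is a minimal sketch of the same chunk, sort, merge pattern using only common Unix utilities. It is an illustration under assumptions, not the ABySS pipeline: filter_chunk is a hypothetical stand-in for 'KAligner | ParseAligns', merge_sorted_chunks is a made-up helper name, the input is assumed to be a non-empty line-oriented file, and mktemp -d is assumed to be available. The point it shows is that each chunk can be sorted independently and the sorted chunks combined with sort -m, which only merges and does not re-sort.

```shell
#!/bin/sh
# Use one collation for both the per-chunk sort and the final merge;
# sort -m gives wrong results if the inputs were sorted under another locale.
LC_ALL=C
export LC_ALL

# Hypothetical stand-in for 'KAligner | ParseAligns': passes lines through.
filter_chunk() {
    cat
}

# merge_sorted_chunks FILE LINES_PER_CHUNK
# Splits FILE into chunks, filters and sorts each chunk in the background,
# then merges the already-sorted chunks onto stdout.
merge_sorted_chunks() {
    tmpdir=$(mktemp -d) || return 1
    split -l "$2" "$1" "$tmpdir/chunk."
    for f in "$tmpdir"/chunk.*; do
        # Each chunk is processed independently of the others.
        filter_chunk < "$f" | sort > "$f.sorted" &
    done
    wait    # block until every background sort has finished
    # -m merges pre-sorted inputs without sorting them again.
    sort -m "$tmpdir"/chunk.*.sorted
    rm -rf "$tmpdir"
}
```

With a real filter in place of filter_chunk, the output of merge_sorted_chunks could be piped into the next stage in the same way the parallel command above pipes into DistanceEst. Unlike parallel --pipe, this sketch starts one background job per chunk with no limit on concurrency, so it is only meant to show the merge logic.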
/Ole