Re: Parallel Merge

From:	Ole Tange
Subject:	Re: Parallel Merge
Date:	Sat, 20 Aug 2011 16:18:38 +0200

On Sat, Aug 20, 2011 at 12:54 AM, Nathan Watson-Haigh
<nathan.watson-haigh@awri.com.au> wrote:
>
> What I'm actually doing is using the ABySS genome assembler. Part of the 
> pipeline is:
>
> KAligner | ParseAligns | sort | DistanceEst
>
> KAligner takes sequences from one file (queries) and finds alignments against 
> sequences in another file (targets), outputting these in Sequence 
> Alignment/Map (SAM) format. ParseAligns takes the SAM format and filters out 
> some alignments. It is the ParseAligns step which is slowest and I'm looking 
> at how best to split up the work to make use of more cores. A job for early 
> next week!

The most obvious way seems to be:

  cat queries | parallel --pipe --files 'KAligner | ParseAligns | sort' |
    parallel -Xj1 sort -m {}\;rm {} | DistanceEst

Would that work?
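
The idea relies on 'sort -m' merging inputs that are already sorted, so
each chunk can be sorted on its own and the results combined cheaply.
A minimal sketch of just that step, using only split and sort (the
sample data, the chunk size of 3 lines and the temp file names are made
up for illustration; split stands in for 'parallel --pipe' here):

```shell
#!/bin/sh
# Sketch: sort chunks independently, then merge them with `sort -m`.
# Input data, chunk size and file names are hypothetical.
LC_ALL=C
export LC_ALL

tmp=$(mktemp -d)
printf '%s\n' pear apple fig cherry banana date grape kiwi > "$tmp/input"

# Split the input into chunks of 3 lines each.
split -l 3 "$tmp/input" "$tmp/chunk."

# Sort every chunk on its own; these sorts are independent of each other.
for f in "$tmp"/chunk.*; do
  sort "$f" > "$f.sorted"
done

# Merge the pre-sorted chunks. -m only merges, it does not re-sort.
merged=$(sort -m "$tmp"/chunk.*.sorted)

# Reference result: one sort over the whole input.
expected=$(sort "$tmp/input")

if [ "$merged" = "$expected" ]; then
  echo "merge matches full sort"
fi

rm -rf "$tmp"
```

The merge is only correct if every chunk was sorted with the same
ordering (same locale and same sort keys) as the final 'sort -m'.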

/Ole
