Re: Any tips about parallel and sem when using intensive I/O operations?


From: Ole Tange
Subject: Re: Any tips about parallel and sem when using intensive I/O operations?
Date: Thu, 15 Mar 2012 20:41:57 +0100

On Thu, Mar 15, 2012 at 7:10 PM, ningyi shao <shaoningyi@gmail.com> wrote:
> I am currently using parallel (in fact, sem) to run samtools and other
> next-generation sequencing analysis tools.
> My setup is quite similar to the one described in this blog post:
> http://zvfak.blogspot.com/2012/02/samtools-in-parallel.html
> But I prefer to use sem like this:
>
>> export PRO="${HOME}/projects/2012-03-09_H3K4me3"
>> export RESULT="${PRO}/result/ngs.plot/2012-03-15"
>> export DATA="${PRO}/data/reheader"
>> mkdir -p ${RESULT}
>>
>> INPUTS=("Sample_H" "Sample_G")
>>
>> # setup tagdirectory of inputs
>> for INPUT in ${INPUTS[@]};do
>>     sem -j4 samtools rmdup -s ${DATA}/${INPUT}.bam ${RESULT}/${INPUT}_rmdup.bam
>> done
>>
>> TREATS=("Sample_D" "Sample_E" "Sample_F")
>> for TREAT in ${TREATS[@]};do
>>     sem -j4 samtools rmdup -s ${DATA}/${TREAT}.bam ${RESULT}/${TREAT}_rmdup.bam
>> done
>> sem -w

Do you get the same problem with:

parallel samtools rmdup -s ${DATA}/{}.bam ${RESULT}/{}_rmdup.bam ::: "${INPUTS[@]}"

parallel samtools rmdup -s ${DATA}/{}.bam ${RESULT}/{}_rmdup.bam ::: "${TREATS[@]}"

or even:

parallel samtools rmdup -s ${DATA}/{}.bam ${RESULT}/{}_rmdup.bam ::: "${INPUTS[@]}" "${TREATS[@]}"

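Before running the real jobs, it may help to preview the command lines that parallel would generate. The following is only a sketch: the /tmp paths are placeholders rather than the directories from the thread, the sample names are passed as a plain list instead of via the arrays, and it assumes GNU Parallel is the installed parallel. --dry-run prints each command instead of running it, and -k keeps the output in input order.

```shell
# Placeholder paths (assumptions, not the real project directories).
DATA=/tmp/data
RESULT=/tmp/result

# Only proceed if the installed "parallel" is GNU Parallel.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # {} is replaced by each argument after ":::".
    # --dry-run prints the commands without executing samtools.
    preview=$(parallel -k --dry-run \
        samtools rmdup -s "${DATA}"/{}.bam "${RESULT}"/{}_rmdup.bam \
        ::: Sample_H Sample_G Sample_D Sample_E Sample_F)
    printf '%s\n' "$preview"
else
    echo "GNU Parallel not found; nothing to preview" >&2
fi
```

If the printed lines look right, dropping --dry-run runs the same commands for real.
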
> But I ran into a problem: when the server is under heavy load, sem
> sometimes loses output at random.

As you can imagine, it is hard for others to reproduce that problem.
That is why the man page says:

       Your bug report should always include:

       · The output of parallel --version. If you are not
         running the latest released version you should specify
         why you believe the problem is not fixed in that
         version.

       · A complete example that others can run that shows the
         problem. A combination of seq, cat, echo, and sleep can
         reproduce most errors. If your example requires large
         files, see if you can make them by something like seq
         1000000 >file.

       If you suspect the error is dependent on your
       distribution, please see if you can reproduce the error
       on one of these VirtualBox images:
       http://sourceforge.net/projects/virtualboximage/files/
       Specifying the name of your distribution is not enough as
       you may have installed software that is not in the
       VirtualBox images.
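
A self-contained example along those lines might look like the sketch below. It is not taken from the thread: the job count, the semaphore name "repro" and the temporary directory are assumptions chosen for illustration. It queues short jobs through sem, built only from seq, sleep and echo, and then checks that every output file is complete, so that lost output would show up as a non-zero count.

```shell
# Reproduction sketch: many short jobs through sem, then verify the outputs.
# Assumptions: 20 jobs, semaphore name "repro", a fresh temporary directory.
workdir=$(mktemp -d)
njobs=20
missing=0

if command -v sem >/dev/null 2>&1; then
    for i in $(seq "$njobs"); do
        # Same pattern as the original loops: at most 4 jobs at a time.
        sem -j4 --id repro "sleep 0.1; seq 1000 > $workdir/out_$i.txt"
    done
    sem --wait --id repro   # block until all queued jobs have finished

    for i in $(seq "$njobs"); do
        if [ -f "$workdir/out_$i.txt" ]; then
            lines=$(wc -l < "$workdir/out_$i.txt" | tr -d ' ')
        else
            lines=0
        fi
        if [ "$lines" -ne 1000 ]; then
            echo "job $i: expected 1000 lines, got $lines"
            missing=$((missing + 1))
        fi
    done
else
    echo "sem (GNU Parallel) not found; nothing was run" >&2
fi

echo "incomplete outputs: $missing of $njobs"
```

Running it while the machine is under heavy load, and posting the script together with the output of parallel --version, would give others something concrete to run.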


/Ole