From MAILER-DAEMON Tue Mar 06 17:16:01 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S52fp-0005Um-SX for mharc-parallel@gnu.org; Tue, 06 Mar 2012 17:16:01 -0500 Received: from eggs.gnu.org ([208.118.235.92]:38065) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S52fi-0005Uf-NX for parallel@gnu.org; Tue, 06 Mar 2012 17:16:00 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S52fh-0007VR-1x for parallel@gnu.org; Tue, 06 Mar 2012 17:15:54 -0500 Received: from nm3-vm0.bullet.mail.bf1.yahoo.com ([98.139.212.154]:34931) by eggs.gnu.org with smtp (Exim 4.71) (envelope-from ) id 1S52fg-0007VK-Q4 for parallel@gnu.org; Tue, 06 Mar 2012 17:15:52 -0500 Received: from [98.139.212.151] by nm3.bullet.mail.bf1.yahoo.com with NNFMP; 06 Mar 2012 22:15:51 -0000 Received: from [98.139.212.251] by tm8.bullet.mail.bf1.yahoo.com with NNFMP; 06 Mar 2012 22:15:51 -0000 Received: from [127.0.0.1] by omp1060.mail.bf1.yahoo.com with NNFMP; 06 Mar 2012 22:15:51 -0000 X-Yahoo-Newman-Property: ymail-3 X-Yahoo-Newman-Id: 355304.73954.bm@omp1060.mail.bf1.yahoo.com Received: (qmail 19835 invoked by uid 60001); 6 Mar 2012 22:15:51 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1331072151; bh=VhiPXIMKlUeC4twH8RoKBrTp5YfGE+Hk7b4eaHXz4tc=; h=X-YMail-OSG:Received:X-Mailer:Message-ID:Date:From:Subject:To:In-Reply-To:MIME-Version:Content-Type; b=Cj3b4HjU7A6bs5kS/cxGoaEJK2vCbTS5k9uParB87rgzzaj/r94mAtHvpAHy5VN98SXP7Q6lM3LT5o2VKGthZlzv8tuU6nAYvpBNBZyOclud511AlfMYznHehyXz9yX3PP850PhI/f3gLzC/imtQZyDsxg/bqZtQrii8mP5WMlQ= DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=X-YMail-OSG:Received:X-Mailer:Message-ID:Date:From:Subject:To:In-Reply-To:MIME-Version:Content-Type; b=ZssBs1DcHZKag1U2/LRsRL+ho6DZh4+3uxM4KAPPrWwuBRxDht6gQlpRqkPfDDf0B5fqCpwHlAK+Kletw1Nl1wOqqADreSPim0D6irSXK6Cry/uP+AjrKj9cULwOVfoEJLrjHJFR/Q/DY2v9c7eBzfxKcrHbldeiIrADAcyc2BI=; X-YMail-OSG: 
JsEuk9oVM1kseFNxOXOqkI_8JS2Tu4Ah1Qy4V2fK_iGtW6t 595tp.Id1Am9fM7_1epFuBo9yYkCXmMbxGBFSsSAl75WAlv3UNl8PkoRdl5p Hc5pIDsjEVqAi29ISE4vJPimSpMs68y_mUgar27iOxuGRDOXWctLMJxEEkQL 2AzzyMkoOjV9AujLxBUtqgYTqAaxx4nWWfyMHS2xZiBcMLnQmdZSoYMzdGkf yySFPL4u9T8UtNCcdZm7N1U429ev.q3fBKxbY2zvC92Gtfvx4jpgA7VF2PTj RLcNoCH2t0.h1gMTpClx7h1yImQs9629PyWX1UjDM1qB30Ef8FGhUzzAuOh3 2kbDTHbzedifDRzPTDOg9SsbsUhrjC4xfI0gAy64DePdQf05GGKWYhmeunX0 9.s..YtmUQlYnQ9Xw71BJfprU.LmyBW31JxeOjs9TcS_q_9ovcdsWrj8Ord5 xGz72TGceyBBiK99eTtvGJnrtpKjTtL1TLyz04m7JnpDq82D1ZA-- Received: from [77.99.105.151] by web161303.mail.bf1.yahoo.com via HTTP; Tue, 06 Mar 2012 14:15:51 PST X-Mailer: YahooMailClassic/15.0.5 YahooMailWebService/0.8.116.338427 Message-ID: <1331072151.17936.YahooMailClassic@web161303.mail.bf1.yahoo.com> Date: Tue, 6 Mar 2012 14:15:51 -0800 (PST) From: yacob sen Subject: Word too long ?? To: parallel@gnu.org In-Reply-To: MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="-1399187143-1375759389-1331072151=:17936" X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 98.139.212.154 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 06 Mar 2012 22:16:00 -0000 ---1399187143-1375759389-1331072151=:17936 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable

Hi All

I was trying to run a GNU parallel script that was run on my Ubuntu 10.10 version with no problems. Today I upgraded my Linux OS to Ubuntu version 11.04 and tried to run the same GNU parallel script but I received the error message as:

"Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
Word too long."

Where does this "Word too long" come from?

Kind regards Yacob ---1399187143-1375759389-1331072151=:17936 Content-Type: text/html; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable =

---1399187143-1375759389-1331072151=:17936-- From MAILER-DAEMON Fri Mar 09 04:37:48 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S5wGi-0000qJ-Li for mharc-parallel@gnu.org; Fri, 09 Mar 2012 04:37:48 -0500 Received: from eggs.gnu.org ([208.118.235.92]:38806) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S5wGf-0000pU-8F for parallel@gnu.org; Fri, 09 Mar 2012 04:37:46 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S5wGY-0005kN-UW for parallel@gnu.org; Fri, 09 Mar 2012 04:37:44 -0500 Received: from mail-pw0-f41.google.com ([209.85.160.41]:36556) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S5wGY-0005kA-Lf for parallel@gnu.org; Fri, 09 Mar 2012 04:37:38 -0500 Received: by pbcup15 with SMTP id up15so2530117pbc.0 for ; Fri, 09 Mar 2012 01:37:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type :content-transfer-encoding; bh=Pmf8F9c++KTVbepWgQ/ErkXUIVNaHtiJyvo3+p1iAiQ=; b=raZEDfn3YtisTkOqa1wFKCv5pVtFOClbrD+vfLB6W2LYjAQzqKEjzBn+lu7oKaaLZF Q5UdIrTGvduHIfoqThN8JYNuGhWI8LvFLcpsxu/OrV93rCUvhAJZHC4ZlafQV9/Ox9bN uXoDE0KmJ/0QY8GNWDTEa97Dw8kTTEILSFER2/MIQ2CHzzXPNnTVzR7Pzrx+baulj+7F zTVUiZ9qKpOBHp66huQpFOrlDBv2Scme95H/Tfe7z6KFyAarlG/dtMMaCtPjJvhjulwl SJ31c6kyMQSXXoNzq08I515cz1Vj9FLyEO6MURKSNMt8IBehMUetx34gwYY0C15I6lvs Q81w== Received: by 10.68.241.2 with SMTP id we2mr2436579pbc.53.1331285855153; Fri, 09 Mar 2012 01:37:35 -0800 (PST) MIME-Version: 1.0 Sender: ole.tange@gmail.com Received: by 10.143.157.8 with HTTP; Fri, 9 Mar 2012 01:37:15 -0800 (PST) In-Reply-To: <1331072151.17936.YahooMailClassic@web161303.mail.bf1.yahoo.com> References: <1331072151.17936.YahooMailClassic@web161303.mail.bf1.yahoo.com> From: Ole Tange Date: Fri, 9 Mar 2012 10:37:15 +0100 X-Google-Sender-Auth: VpMWwXb9AF00FokSuylXvTHrc9w Message-ID: 
Subject: Re: Word too long ?? To: yacob sen Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 209.85.160.41 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 09 Mar 2012 09:37:47 -0000

On Tue, Mar 6, 2012 at 11:15 PM, yacob sen wrote:
>
> I was trying to run a GNU parallel script that was run on my Ubuntu 10.10
> version with no problems. Today I upgraded my Linux OS to Ubuntu version
> 11.04 and tried to run the same GNU parallel script but I received the error
> message as:
>
> "Computer:jobs running/jobs completed/%of started jobs/Average seconds to
> complete
> Word too long."
>
> Where does this "Word too long" come from?

Yeah that is weird. I cannot reproduce the error.

Report bugs to <bug-parallel@gnu.org> or
https://savannah.gnu.org/bugs/?func=additem&group=parallel

Your bug report should always include:

* The output of parallel --version. If you are not running the latest released version you should specify why you believe the problem is not fixed in that version.

* A complete example that others can run that shows the problem. A combination of seq, cat, echo, and sleep can reproduce most errors. If your example requires large files, see if you can make them by something like seq 1000000>file.

If you suspect the error is dependent on your distribution, please see if you can reproduce the error on one of these distributions:
http://sourceforge.net/projects/virtualboximage/files/

/Ole

From MAILER-DAEMON Fri Mar 09 05:36:35 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S5xBb-0004gL-RD for mharc-parallel@gnu.org; Fri, 09 Mar 2012 05:36:35 -0500 Received: from eggs.gnu.org ([208.118.235.92]:44989) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S5xBU-0004gC-8W for parallel@gnu.org; Fri, 09 Mar 2012 05:36:34 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S5xBN-0000Ne-Qd for parallel@gnu.org; Fri, 09 Mar 2012 05:36:27 -0500 Received: from nm32-vm5.bullet.mail.bf1.yahoo.com ([72.30.239.141]:34406) by eggs.gnu.org with smtp (Exim 4.71) (envelope-from ) id 1S5xBN-0000NM-Hq for parallel@gnu.org; Fri, 09 Mar 2012 05:36:21 -0500 Received: from [98.139.215.140] by nm32.bullet.mail.bf1.yahoo.com with NNFMP; 09 Mar 2012 10:36:19 -0000 Received: from [98.139.212.215] by tm11.bullet.mail.bf1.yahoo.com with NNFMP; 09 Mar 2012 10:36:19 -0000 Received: from [127.0.0.1] by omp1024.mail.bf1.yahoo.com with NNFMP; 09 Mar 2012 10:36:19 -0000 X-Yahoo-Newman-Property: ymail-3 X-Yahoo-Newman-Id: 570324.98390.bm@omp1024.mail.bf1.yahoo.com Received: (qmail 65846 invoked by uid 60001); 9 Mar 2012 10:36:19 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1331289379; bh=o3jyYFU0oKIDptclwlRBIBPyDkZDV5KeHWwuKRdzBkg=; h=X-YMail-OSG:Received:X-Mailer:Message-ID:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type; b=mmK8gPBj7L3eeDxjf8jMAZUEkDvhEdW/NuFF5CETggum2B7CcaUz/Z+9/Zwe0gQQoq5EeTa9WYYLNX+SEbuFmog0GZKn79gfGNeWzNvOaE2k3g5lCWeKR3ts5+U4DtqRGGUeZA7eoM24xhGqbCCvVY15pFMBRAhfwmeGHdx3ip4= DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=X-YMail-OSG:Received:X-Mailer:Message-ID:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type; 
b=pt+E5TFQwgYq6a3T9TN3RKWRVaFDdumYqILnj6Z2Zeu//tgvI8Om9cLX2h4/8i7VzcragdypJ5vhmc9pBnfLQ+O2EA39BNL4L/PA1bUpDrNh8H4f/5V1Lu0jWypIBBAfmbcIEHNQ/sNY3wGnQaINiDToirHwmzWE1/8XhcnsGqk=; X-YMail-OSG: sR9nnpQVM1lVE6UYK_tD86QM7o2OJvdcbhC3dgOiQtwlHks vK_br3dQPnx.ce6PapiDS.XtGcOkwy5LfEaqhNb28RJfLasVNbi8sULJArJt 5z_nt_2lWU1w0Q8YZhnW2fB2SkuK9Uh6eHjAUJISgOIgSyJ2DE.thjnawzCq tAuhhmCveOu_.SW0jEGgZqbqzl68Rtp_y3jxAoFRqLgKWOydO3TLtOuQWljR plUZrFxiXXtHqTuXQWvUiqGinW1l4fvKO7bZreRQNPiiPJgQHO5PqYWTGE3D EeWfgpqpXxPle3l4QO4juKk58KJOeTfEjMfu2mMql9iRQKVQmRdHdJyuL7GD gKGXzA_5xN9J.jBPql_F_jhoI0uwxU3p6zpAm_Q1ucm67dANiFDNzG7QReHY rb0Mnis2JmvHAq1aS5H5nYAzLg0WgVPPyWcvL5Ebc._nGKXpgdERlT_h_6Ap vuoFirkJ_94UH.rkEOsWVc620kTLecBlQ0UZbCrftcKn9eXccG17yHK.0804 uiHqXbR_Z6Dpn9crM_HMlXs8chW4UqeNYK3vKInrNHRGjPeKGKvVVDrPsPZ1 ZqQp4r83KfR1atdvM245ezEgQi0uid8c7ZkA- Received: from [129.215.6.207] by web161305.mail.bf1.yahoo.com via HTTP; Fri, 09 Mar 2012 02:36:19 PST X-Mailer: YahooMailClassic/15.0.5 YahooMailWebService/0.8.116.338427 Message-ID: <1331289379.55383.YahooMailClassic@web161305.mail.bf1.yahoo.com> Date: Fri, 9 Mar 2012 02:36:19 -0800 (PST) From: yacob sen Subject: Re: Word too long ?? To: Ole Tange In-Reply-To: MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="1854988078-1108701542-1331289379=:55383" X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 72.30.239.141 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 09 Mar 2012 10:36:34 -0000 --1854988078-1108701542-1331289379=:55383 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable

Dear Ole

Thank you for your reply. It is indeed weird. I perhaps have to downgrade my Ubuntu from 11.10 to 10.10 for my GNU parallel to work. I'm using the latest version of GNU parallel. But I still get the same result.

Computers / CPU cores / Max jobs to run
1:local / 4 / 4

Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
Word too long.
then it stops.

Kind regards
Yacob

--- On Fri, 9/3/12, Ole Tange wrote:

From: Ole Tange
Subject: Re: Word too long ??
To: "yacob sen"
Cc: parallel@gnu.org
Date: Friday, 9 March, 2012, 10:37

On Tue, Mar 6, 2012 at 11:15 PM, yacob sen wrote:
>
> I was trying to run a GNU parallel script that was run on my Ubuntu 10.10
> version with no problems. Today I upgraded my Linux OS to Ubuntu version
> 11.04 and tried to run the same GNU parallel script but I received the error
> message as:
>
> "Computer:jobs running/jobs completed/%of started jobs/Average seconds to
> complete
> Word too long."
>
> Where does this "Word too long" come from?

Yeah that is weird. I cannot reproduce the error.

Report bugs to <bug-parallel@gnu.org> or
https://savannah.gnu.org/bugs/?func=additem&group=parallel

Your bug report should always include:

* The output of parallel --version. If you are not running the latest released version you should specify why you believe the problem is not fixed in that version.

* A complete example that others can run that shows the problem. A combination of seq, cat, echo, and sleep can reproduce most errors. If your example requires large files, see if you can make them by something like seq 1000000>file.

If you suspect the error is dependent on your distribution, please see if you can reproduce the error on one of these distributions:
http://sourceforge.net/projects/virtualboximage/files/

/Ole

--1854988078-1108701542-1331289379=:55383 Content-Type: text/html; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable

--1854988078-1108701542-1331289379=:55383-- From MAILER-DAEMON Fri Mar 09 18:18:39 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S6955-0007ZP-Cu for mharc-parallel@gnu.org; Fri, 09 Mar 2012 18:18:39 -0500 Received: from eggs.gnu.org ([208.118.235.92]:38286) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S6952-0007YV-HG for parallel@gnu.org; Fri, 09 Mar 2012 18:18:37 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S6950-00011p-Nt for parallel@gnu.org; Fri, 09 Mar 2012 18:18:36 -0500 Received: from mail-pz0-f41.google.com ([209.85.210.41]:37688) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S6950-00011L-FT for parallel@gnu.org; Fri, 09 Mar 2012 18:18:34 -0500 Received: by dadv6 with SMTP id v6so2111579dad.0 for ; Fri, 09 Mar 2012 15:18:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type :content-transfer-encoding; bh=1WONH4cmwDYF4duKC9m3Og4PoiIo5kfux+QWouIHElI=; b=WbF1FSJUWIjIypLDQYk6NI1RxHjdsM0fRbHTVxMshAaks4iF5UdVx4VjPSjkmLNTys JRZEXMM0t0Eg42J4f2P9/h0lIQvP/vN2bK7MXExQiYTe5GUSp0HRtbib8q5s0fWeQB+x Te8z6xd6KXVyh2Kg5mxBh4ren2NNryqJDzIqHzA8hM40TuLir0HOCW3GuKkJu6AOKQeZ m2lDkzcgEoFP82QXW/Mf9DUi5vTwLp+I2JfW3yyDLfjvFIY10oAdAERQfGYj8Nqsz6mb wQPyuLHK7bR68YqZzgSW2ApvEhCdQ7Q3AwCzj4qNxban+HxNHBX70hXt7h+mMiJ2Ruge pt5Q== Received: by 10.68.225.194 with SMTP id rm2mr7061084pbc.95.1331335112352; Fri, 09 Mar 2012 15:18:32 -0800 (PST) MIME-Version: 1.0 Sender: ole.tange@gmail.com Received: by 10.143.157.8 with HTTP; Fri, 9 Mar 2012 15:18:12 -0800 (PST) In-Reply-To: <1331289379.55383.YahooMailClassic@web161305.mail.bf1.yahoo.com> References: <1331289379.55383.YahooMailClassic@web161305.mail.bf1.yahoo.com> From: Ole Tange Date: Sat, 10 Mar 2012 00:18:12 +0100 X-Google-Sender-Auth: ewDkRebtE7ND70mZkge82QUgYIk Message-ID: Subject: 
Re: Word too long ?? To: yacob sen Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 209.85.210.41 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 09 Mar 2012 23:18:38 -0000

On Fri, Mar 9, 2012 at 11:36 AM, yacob sen wrote:
>
> Thank you for your reply. It is indeed weird. I perhaps have to downgrade
> my Ubuntu from 11.10 to 10.10 for my GNU parallel to work. I'm using the
> latest version of GNU parallel.

There are (at least) 5 different "latest" versions of GNU Parallel:

* the one that only exists on my laptop
* the latest in GIT
* the latest alpha-release
* the latest (normal) release
* the latest stable release

That is why the man page tells you to ALWAYS include:

* The output of parallel --version. If you are not running the latest released version you should specify why you believe the problem is not fixed in that version.

* A complete example that others can run that shows the problem. A combination of seq, cat, echo, and sleep can reproduce most errors. If your example requires large files, see if you can make them by something like seq 1000000>file.

If you suspect the error is dependent on your distribution, please see if you can reproduce the error on one of these VirtualBox images:
http://sourceforge.net/projects/virtualboximage/files/

Specifying the name of your distribution is not enough as you may have installed software that is not in the VirtualBox images.

If you do not give me enough information to be able to reproduce the error, I will usually ignore the bug report.

/Ole

From MAILER-DAEMON Sun Mar 11 09:57:32 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S6jHA-0007GA-0U for mharc-parallel@gnu.org; Sun, 11 Mar 2012 09:57:32 -0400 Received: from eggs.gnu.org ([208.118.235.92]:32972) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S6WLn-0004Wi-TK for parallel@gnu.org; Sat, 10 Mar 2012 19:09:29 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S6WLl-0004sw-RV for parallel@gnu.org; Sat, 10 Mar 2012 19:09:27 -0500 Received: from cxe.ucsd.edu ([137.110.243.111]:60140) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S6WLl-0004sg-Km for parallel@gnu.org; Sat, 10 Mar 2012 19:09:25 -0500 Received: by cxe.ucsd.edu (Postfix, from userid 500) id 42DDAE02E2; Sat, 10 Mar 2012 16:09:21 -0800 (PST) Date: Sat, 10 Mar 2012 16:09:21 -0800 From: Chris X Edwards To: parallel@gnu.org Subject: silly parallel bug Message-ID: <20120311000921.GP30453@ucsd.edu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 137.110.243.111 X-Mailman-Approved-At: Sun, 11 Mar 2012 09:57:30 -0400 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 11 Mar 2012 00:09:29 -0000

Hey,

I'm using parallel when I can and spreading the good word about its wholesome goodness. But one of my users found a silly bug recently. It's almost not a bug. But maybe the bug was that the program says there's a bug when there really isn't.

Basically, you can't use sem on a readonly system. Duh, right?

./dockInitPara.sh: line 18: dockInitPara.log: Read-only file system
parallel: This should not happen. You have found a bug.
Please contact and include:
* The version number: 20110322
* The bugid: Can't open semaphore file /home/ak/.parallel/semaphores/id-dockInitPara.lock: Read-only file system

Just thought you might go ahead and check for that and take it out of the bug category. Maybe provide an error message reminding them of the mount status of the working directory (or questioning the user's competence). Etc.

Best wishes,
Chris

--
Chris X Edwards - xed.name/ucsd

From MAILER-DAEMON Mon Mar 12 18:24:28 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S7DfI-0002pV-2D for mharc-parallel@gnu.org; Mon, 12 Mar 2012 18:24:28 -0400 Received: from eggs.gnu.org ([208.118.235.92]:35953) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S7DfD-0002mn-JB for parallel@gnu.org; Mon, 12 Mar 2012 18:24:24 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S7Df8-0000cU-R9 for parallel@gnu.org; Mon, 12 Mar 2012 18:24:23 -0400 Received: from mail-pz0-f41.google.com ([209.85.210.41]:57210) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S7Df8-0000ZZ-Ie for parallel@gnu.org; Mon, 12 Mar 2012 18:24:18 -0400 Received: by dadv6 with SMTP id v6so6026489dad.0 for ; Mon, 12 Mar 2012 15:24:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:from:date:x-google-sender-auth:message-id :subject:to:content-type; bh=e6yEZc6eq3pKqByzskHuMZ/xTQmjEvkTFB0R6/FksUc=; b=hUHqPTOQc86wjFbckG4ShUfYa7oXJo9P5DkkPG/9de03szZFUv7FJjaM0+d5XaMBwt qsO1KG/2MkAS92rU781VD0D5x9GudsGutiuUYQ3i/JsryC/de8vHFrZS7Du59Hhib5vP tX+9UUo1hvKboB6Fdj+UoMCybmN/CX2M4jAFFBGa6i9oz/UBXovmtDxSZdI6a29GX0sg ZdGu1j5YhCgV4O4qqjCx1qlmIXzUpCKlwPnNoHm25uRjysjhc2oEH/xN+wqriRWpG8mq XOW2fhoWUiwqZtdAdSd+BdEFBKCgRSg3LZ9T5A0LvY7G21fpGmaXwydPRX/+6RujzSZ4 ZGpQ== Received: by 10.68.203.74 with SMTP id ko10mr3237271pbc.125.1331591055411; Mon, 12 Mar 2012 15:24:15 -0700 (PDT) MIME-Version: 1.0 Sender: ole.tange@gmail.com 
Received: by 10.143.157.8 with HTTP; Mon, 12 Mar 2012 15:23:55 -0700 (PDT) From: Ole Tange Date: Mon, 12 Mar 2012 23:23:55 +0100 X-Google-Sender-Auth: -WryPlZRXOz89XqbWAaI4iqmVkI Message-ID: Subject: Directory for shared locks To: parallel@gnu.org Content-Type: text/plain; charset=ISO-8859-1 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 209.85.210.41 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 12 Mar 2012 22:24:24 -0000

Currently ~/.parallel/ is used for locks, control paths for -M, and temporary files for loadavg and swap_activity.

This was chosen because if you run on multiple servers ~/ is likely to be shared amongst them, whereas /tmp is likely to be local to the machine. And you want a shared lock so that if you run 'sem --id foo' then only one process is being run no matter how many machines it is being run on.

Apparently some users do not have write access to their home dir(!), so I am considering having a variable so you can point ~/.parallel to another place.

$PARALLELDIR or $PARALLEL_DIR seem like reasonable choices to me. They will of course default to ~/.parallel.

Do you have a better idea for solving the issue or for the variable name?

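The defaulting described above could look roughly like the sketch below. It is only an illustration: PARALLEL_DIR is the name under discussion in this mail, not an existing setting, and resolve_lockdir is a hypothetical helper name.

```shell
# Sketch only: PARALLEL_DIR is the variable name proposed in this thread,
# not an existing setting; resolve_lockdir is a hypothetical helper.
# Print $PARALLEL_DIR if it is set and non-empty, else ~/.parallel.
resolve_lockdir() {
    printf '%s\n' "${PARALLEL_DIR:-$HOME/.parallel}"
}

lockdir=$(resolve_lockdir)

# Complain up front if the directory exists but is not writable,
# rather than failing later when a lock file cannot be created.
if [ -d "$lockdir" ] && [ ! -w "$lockdir" ]; then
    echo "cannot write to $lockdir; point PARALLEL_DIR at a writable directory" >&2
else
    echo "lock directory: $lockdir"
fi
```

A user on a read-only home directory would then export PARALLEL_DIR to a writable, preferably shared, location before running sem.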
/Ole From MAILER-DAEMON Wed Mar 14 17:34:49 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S7vqL-0007ym-Ih for mharc-parallel@gnu.org; Wed, 14 Mar 2012 17:34:49 -0400 Received: from eggs.gnu.org ([208.118.235.92]:38936) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S7TSk-0004lt-DP for parallel@gnu.org; Tue, 13 Mar 2012 11:16:40 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S7TSf-00021B-37 for parallel@gnu.org; Tue, 13 Mar 2012 11:16:33 -0400 Received: from mailout.rz.uni-frankfurt.de ([141.2.22.233]:40627 helo=mailout.cluster.uni-frankfurt.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S7TSe-00020t-TM for parallel@gnu.org; Tue, 13 Mar 2012 11:16:29 -0400 Received: from smtpauth2-ha.cluster.uni-frankfurt.de ([10.1.1.121]) by mailout.cluster.uni-frankfurt.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S7TSc-0002DN-28 for parallel@gnu.org; Tue, 13 Mar 2012 16:16:26 +0100 Received: from p54b0bad6.dip.t-dialin.net ([84.176.186.214] helo=[192.168.2.17]) by fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S7TSZ-0007m6-VE for parallel@gnu.org; Tue, 13 Mar 2012 16:16:24 +0100 Message-ID: <4F5F64C7.30300@med.uni-frankfurt.de> Date: Tue, 13 Mar 2012 16:16:23 +0100 From: Thomas Sattler User-Agent: Mozilla/5.0 (X11; Linux i686; rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2 MIME-Version: 1.0 To: parallel@gnu.org Subject: comprehension questions X-Enigmail-Version: 1.3.5 Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: 7bit X-MailScanner: Found to be clean X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 141.2.22.233 X-Mailman-Approved-At: Wed, 14 Mar 2012 17:34:48 -0400 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Tue, 13 Mar 2012 15:16:40 -0000 Hi there ... My name is Thomas, I'm new to GNU parallel. I read about the "Whitney" release in the German Linux Magazin. I read the entire Manpage and watched Ole's HowTo videos, twice. Then I started to read the Manpage again and mark everything I still do not understand. While I might be able to understand what these passages are about by "trial and error", I think it is better to ask on this list and we maybe find a better description to be included in the next release. Thomas From MAILER-DAEMON Thu Mar 15 09:30:49 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8AlV-0001eY-NA for mharc-parallel@gnu.org; Thu, 15 Mar 2012 09:30:49 -0400 Received: from eggs.gnu.org ([208.118.235.92]:52328) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8AlT-0001de-Ex for parallel@gnu.org; Thu, 15 Mar 2012 09:30:48 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8AlN-000263-78 for parallel@gnu.org; Thu, 15 Mar 2012 09:30:46 -0400 Received: from mailout.rz.uni-frankfurt.de ([141.2.22.233]:56687 helo=mailout.cluster.uni-frankfurt.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8AlM-00025N-VJ for parallel@gnu.org; Thu, 15 Mar 2012 09:30:41 -0400 Received: from smtpauth2-ha.cluster.uni-frankfurt.de ([10.1.1.121]) by mailout.cluster.uni-frankfurt.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8AlJ-0007dR-Vt for parallel@gnu.org; Thu, 15 Mar 2012 14:30:38 +0100 Received: from p54b0a1f7.dip.t-dialin.net ([84.176.161.247] helo=[192.168.2.17]) by fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8AlI-0003oL-0P for parallel@gnu.org; Thu, 15 Mar 2012 14:30:36 +0100 Message-ID: <4F61EEFA.7000209@med.uni-frankfurt.de> Date: Thu, 15 Mar 2012 14:30:34 +0100 From: Thomas Sattler User-Agent: Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20120312 Thunderbird/11.0 MIME-Version: 1.0 
To: parallel@gnu.org Subject: "parallel will behave similar to ..." X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: 7bit X-MailScanner: Found to be clean X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 141.2.22.233 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2012 13:30:48 -0000

Hi there ...

I vote for changing the first paragraph in OPTIONS:

|
|  If command is given, GNU parallel will behave
|  similar to xargs.
|

I think this is misleading, as in fact it is more like this:

 parallel     <-->  xargs -n 1
 parallel -m  <-->  xargs

Ole is not sure what'd be the least confusing. He suggested to discuss this on the mailing list, so here we are. I think we should mention "-n 1" to help people that do not read the entire manpage.

Thomas

From MAILER-DAEMON Thu Mar 15 09:50:59 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8B51-0000xO-Qm for mharc-parallel@gnu.org; Thu, 15 Mar 2012 09:50:59 -0400 Received: from eggs.gnu.org ([208.118.235.92]:33103) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8B4r-0000tk-Ra for parallel@gnu.org; Thu, 15 Mar 2012 09:50:59 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8B4U-00006m-S4 for parallel@gnu.org; Thu, 15 Mar 2012 09:50:49 -0400 Received: from mailout.rz.uni-frankfurt.de ([141.2.22.233]:57856 helo=mailout.cluster.uni-frankfurt.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8B4U-00006W-Md for parallel@gnu.org; Thu, 15 Mar 2012 09:50:26 -0400 Received: from smtpauth-ha.cluster.uni-frankfurt.de ([10.1.1.120]) by mailout.cluster.uni-frankfurt.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8B4S-0000BC-8E for parallel@gnu.org; Thu, 15 Mar 2012 14:50:24 +0100 
Received: from p54b0a1f7.dip.t-dialin.net ([84.176.161.247] helo=[192.168.2.17]) by fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8B4O-0006gz-Ks for parallel@gnu.org; Thu, 15 Mar 2012 14:50:20 +0100 Message-ID: <4F61F39A.3090603@med.uni-frankfurt.de> Date: Thu, 15 Mar 2012 14:50:18 +0100 From: Thomas Sattler User-Agent: Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20120312 Thunderbird/11.0 MIME-Version: 1.0 To: parallel@gnu.org Subject: feature request: {#} with leading zeroes X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: 7bit X-MailScanner: Found to be clean X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 141.2.22.233 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2012 13:50:59 -0000

There is an example for {#} in the manpage. Its description says it might be "useful for making input PNG's for ffmpeg":

|
|  find . -type f | sort | parallel convert {} {#}.png
|

I'd guess that it wouldn't work with ffmpeg, as the images would be sorted like this:

10.png 11.png 12.png [...] 18.png 19.png 1.png [...]

Wouldn't it be useful to have sequence numbers with leading zeroes here? I know that parallel doesn't know the number of images to rename and so it cannot find the number of leading zeroes on its own, but maybe there could be a command-line switch for {#}'s (minimum) width?

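The ordering problem can be reproduced without parallel or ffmpeg. The sketch below uses made-up file names 1.png to 12.png and an assumed padding width of 3 digits; LC_ALL=C is set only to make the sort order predictable, and in other locales the exact position of 1.png may differ (as in the listing above), but 2.png still lands after the teens.

```shell
# Hypothetical sequence numbers 1..12, named the way {#}.png would name
# them (no padding), then sorted as plain strings.
unpadded=$(seq 1 12 | sed 's/$/.png/' | LC_ALL=C sort)

# The same numbers zero-padded to 3 digits (the width is an assumption),
# sorted the same way.
padded=$(seq 1 12 | while read -r n; do printf '%03d.png\n' "$n"; done | LC_ALL=C sort)

printf 'unpadded: %s\n' "$(printf '%s' "$unpadded" | tr '\n' ' ')"
printf 'padded:   %s\n' "$(printf '%s' "$padded" | tr '\n' ' ')"
```

With padding, string order and numeric order agree, which is what a tool reading the files in name order needs.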
Thomas From MAILER-DAEMON Thu Mar 15 10:03:08 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8BGm-0003AY-RV for mharc-parallel@gnu.org; Thu, 15 Mar 2012 10:03:08 -0400 Received: from eggs.gnu.org ([208.118.235.92]:56435) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8BGb-0002iz-Tx for parallel@gnu.org; Thu, 15 Mar 2012 10:03:07 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8BGV-0002fD-Vd for parallel@gnu.org; Thu, 15 Mar 2012 10:02:57 -0400 Received: from mail-gx0-f169.google.com ([209.85.161.169]:43528) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8BGV-0002et-Oa for parallel@gnu.org; Thu, 15 Mar 2012 10:02:51 -0400 Received: by ggeq1 with SMTP id q1so3664856gge.0 for ; Thu, 15 Mar 2012 07:02:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type :content-transfer-encoding; bh=b7r8vKALJ6xDFwKcqSjIZPD9QnW6pZdgLeI/T/jCZQ4=; b=vXVkXOt0AjB3t35KHET/EJmONK3h0dz6eDv/8v9QLb3uwdJ3Kwx1SHsdh1uf0HLc0q Rp4Av/sjh5shgTIw4t+JEOjkiea2UmlnOTsGt/H8aOUpXBbmYx5c9vtTAZWkXsigAUyv HDBWYSoMZ+BV/HCb81Qg2bf+3GXMl/eCkg37vp7Tmk5KNxJnkpjr92HSylUe0UJze4WW g+glWE4mX1PhVCGhJEzqA/JoEjk5u1XXQ7N5WrmoUe3DQy45E6Oe5tMO3RMz46cOVyYH J1xjFtmJ7atWAsCYV+C9rkrzM+MGfP74OEbpdYJXkzJHJ1OCoItHsVrXBjyF9GuLl77S HJzQ== Received: by 10.68.230.41 with SMTP id sv9mr5324754pbc.48.1331820168609; Thu, 15 Mar 2012 07:02:48 -0700 (PDT) MIME-Version: 1.0 Sender: ole.tange@gmail.com Received: by 10.143.157.8 with HTTP; Thu, 15 Mar 2012 07:02:28 -0700 (PDT) In-Reply-To: <4F61EEFA.7000209@med.uni-frankfurt.de> References: <4F61EEFA.7000209@med.uni-frankfurt.de> From: Ole Tange Date: Thu, 15 Mar 2012 15:02:28 +0100 X-Google-Sender-Auth: bHXW0ujv8U_WKATHIbVasQaydJE Message-ID: Subject: Re: "parallel will behave similar to ..." 
To: Thomas Sattler Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 209.85.161.169 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2012 14:03:07 -0000

On Thu, Mar 15, 2012 at 2:30 PM, Thomas Sattler wrote:
> | If command is given, GNU parallel will behave
> | similar to xargs.
>
> I think this misleading, as in fact it is more like
> this:
>
>  parallel     <-->  xargs -n 1
>  parallel -m  <-->  xargs
>
> Ole is not sure what'd be the least confusing. He
> suggested to discuss this on the mailinglist, so
> here we are. I think we should mention "-n 1" to
> help people that do not read the entire manpage.

The problem here is that 'parallel foo' is even more similar to:

xargs -I {} -P number_of_cores -n1 bash -c 'foo'

By not putting any options in we are simply saying parallel
can solve the same kind of problems as xargs.
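The difference under discussion can be seen with xargs alone. A minimal sketch, assuming nothing beyond a POSIX shell and xargs; the variable names `per_arg` and `batched` are made up for the illustration, and the comparison with parallel in the comments is the one drawn in this thread, not an exact equivalence:

```sh
#!/bin/sh
# One invocation per argument, comparable to plain `parallel echo`:
per_arg=$(printf '%s\n' a b c | xargs -n 1 echo)
printf '%s\n' "$per_arg"    # three lines: a, b, c

# As many arguments as fit into one invocation, comparable to `parallel -m echo`:
batched=$(printf '%s\n' a b c | xargs echo)
printf '%s\n' "$batched"    # one line: a b c
```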
/Ole From MAILER-DAEMON Thu Mar 15 10:06:35 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8BK7-0003SV-LE for mharc-parallel@gnu.org; Thu, 15 Mar 2012 10:06:35 -0400 Received: from eggs.gnu.org ([208.118.235.92]:32907) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8BJh-0002i6-Pj for parallel@gnu.org; Thu, 15 Mar 2012 10:06:34 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8BJb-0003Xg-Jv for parallel@gnu.org; Thu, 15 Mar 2012 10:06:09 -0400 Received: from mailout.rz.uni-frankfurt.de ([141.2.22.233]:58973 helo=mailout.cluster.uni-frankfurt.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8BJb-0003Wo-De for parallel@gnu.org; Thu, 15 Mar 2012 10:06:03 -0400 Received: from smtpauth2-ha.cluster.uni-frankfurt.de ([10.1.1.121]) by mailout.cluster.uni-frankfurt.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8BJZ-0000wN-De for parallel@gnu.org; Thu, 15 Mar 2012 15:06:01 +0100 Received: from p54b0a1f7.dip.t-dialin.net ([84.176.161.247] helo=[192.168.2.17]) by fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8BJX-0006MK-HQ for parallel@gnu.org; Thu, 15 Mar 2012 15:05:59 +0100 Message-ID: <4F61F746.4040704@med.uni-frankfurt.de> Date: Thu, 15 Mar 2012 15:05:58 +0100 From: Thomas Sattler User-Agent: Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20120312 Thunderbird/11.0 MIME-Version: 1.0 To: parallel@gnu.org Subject: manpage: --delimiter X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: 7bit X-MailScanner: Found to be clean X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 141.2.22.233 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2012 14:06:34 -0000 Hi again ... 
Could someone please enlight me what's the usecase of --delimiter? I'd guess that values within lines should be split by delim while input records (lines) are still terminated by \n. But what would be the difference to --colsep in this case? If the assumption above is wrong, what does "This can be used when the input consists of simply newline-separated items" mean? And why would I want to use --null in that case? I'm quite confused by this passage of the manpage. Thomas From MAILER-DAEMON Thu Mar 15 10:20:47 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8BXr-0001V7-2d for mharc-parallel@gnu.org; Thu, 15 Mar 2012 10:20:47 -0400 Received: from eggs.gnu.org ([208.118.235.92]:33543) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8BXj-0001Hc-UX for parallel@gnu.org; Thu, 15 Mar 2012 10:20:45 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8BXi-0008KZ-C8 for parallel@gnu.org; Thu, 15 Mar 2012 10:20:39 -0400 Received: from mailout.rz.uni-frankfurt.de ([141.2.22.233]:59945 helo=mailout.cluster.uni-frankfurt.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8BXi-0008KM-6B for parallel@gnu.org; Thu, 15 Mar 2012 10:20:38 -0400 Received: from smtpauth-ha.cluster.uni-frankfurt.de ([10.1.1.120]) by mailout.cluster.uni-frankfurt.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8BXg-0001cs-On for parallel@gnu.org; Thu, 15 Mar 2012 15:20:36 +0100 Received: from p54b0a1f7.dip.t-dialin.net ([84.176.161.247] helo=[192.168.2.17]) by fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8BXd-0000OA-5t for parallel@gnu.org; Thu, 15 Mar 2012 15:20:33 +0100 Message-ID: <4F61FAB0.9090208@med.uni-frankfurt.de> Date: Thu, 15 Mar 2012 15:20:32 +0100 From: Thomas Sattler User-Agent: Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20120312 Thunderbird/11.0 MIME-Version: 1.0 To: parallel@gnu.org Subject: what's the use case of an 
eof-string? X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: 7bit X-MailScanner: Found to be clean X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 141.2.22.233 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2012 14:20:45 -0000 Hi there ... eof-strings are only mentioned in the OPTIONS part, but there aren't any examples for this. Could some- body please give a real-live usecase for it? Thomas From MAILER-DAEMON Thu Mar 15 10:34:07 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8Bkl-0008D5-Bd for mharc-parallel@gnu.org; Thu, 15 Mar 2012 10:34:07 -0400 Received: from eggs.gnu.org ([208.118.235.92]:36752) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8Bka-00086J-2C for parallel@gnu.org; Thu, 15 Mar 2012 10:34:05 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8BkT-0002zO-Ox for parallel@gnu.org; Thu, 15 Mar 2012 10:33:55 -0400 Received: from mailout.rz.uni-frankfurt.de ([141.2.22.233]:60891 helo=mailout.cluster.uni-frankfurt.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8BkT-0002wA-IF for parallel@gnu.org; Thu, 15 Mar 2012 10:33:49 -0400 Received: from smtpauth-ha.cluster.uni-frankfurt.de ([10.1.1.120]) by mailout.cluster.uni-frankfurt.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8BkR-0002GU-7N for parallel@gnu.org; Thu, 15 Mar 2012 15:33:47 +0100 Received: from p54b0a1f7.dip.t-dialin.net ([84.176.161.247] helo=[192.168.2.17]) by fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8BkO-0001NH-Kp for parallel@gnu.org; Thu, 15 Mar 2012 15:33:44 +0100 Message-ID: <4F61FDC7.3000804@med.uni-frankfurt.de> Date: Thu, 15 Mar 2012 15:33:43 +0100 From: 
Thomas Sattler User-Agent: Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20120312 Thunderbird/11.0 MIME-Version: 1.0 To: parallel@gnu.org Subject: bad example: GNU Parallel as dir processor X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: 7bit X-MailScanner: Found to be clean X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 141.2.22.233 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2012 14:34:05 -0000 Hi there ... The manpage says, when grouping is disabled, "Output is printed as soon as possible" and "the outputs from different commands are mixed together". As far as I understand, that means not only lines might appear in quite a funny oder, it might also be, that one line of out- put is a mixture of several lines of different jobs. And than there is this example: | | GNU Parallel as dir processor | | [...] | The -u is needed because of a small bug in GNU parallel. If | that proves to be a problem, file a bug report. | In case the mentioned bug has been fixed in the meantime, the example should be updated and the -u should be removed. (As well as the description about an existing bug.) In case the mentioned bug has not already been fixed, I'd vote for removing the whole example, as it's just working. 
Thomas From MAILER-DAEMON Thu Mar 15 10:52:40 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8C2i-0000HD-G6 for mharc-parallel@gnu.org; Thu, 15 Mar 2012 10:52:40 -0400 Received: from eggs.gnu.org ([208.118.235.92]:40858) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8C2Z-0000Bi-4V for parallel@gnu.org; Thu, 15 Mar 2012 10:52:38 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8C2S-0007LO-Tu for parallel@gnu.org; Thu, 15 Mar 2012 10:52:30 -0400 Received: from mailout.rz.uni-frankfurt.de ([141.2.22.233]:33879 helo=mailout.cluster.uni-frankfurt.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8C2S-0007LB-KA for parallel@gnu.org; Thu, 15 Mar 2012 10:52:24 -0400 Received: from smtpauth-ha.cluster.uni-frankfurt.de ([10.1.1.120]) by mailout.cluster.uni-frankfurt.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8C2Q-00037z-0n for parallel@gnu.org; Thu, 15 Mar 2012 15:52:22 +0100 Received: from p54b0a1f7.dip.t-dialin.net ([84.176.161.247] helo=[192.168.2.17]) by fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8C2M-0002ja-II for parallel@gnu.org; Thu, 15 Mar 2012 15:52:18 +0100 Message-ID: <4F620221.3060304@med.uni-frankfurt.de> Date: Thu, 15 Mar 2012 15:52:17 +0100 From: Thomas Sattler User-Agent: Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20120312 Thunderbird/11.0 MIME-Version: 1.0 To: parallel@gnu.org Subject: questions about --load X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: 7bit X-MailScanner: Found to be clean X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 141.2.22.233 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2012 14:52:39 -0000 Hi there 
... I'd vote for improving the manpage about --load: | | Only difference is 0 which actually means 0. | Question: Does "--load 0" wait until the system is completely idle (0.00) or just as long as the load drops below 1 (0.x)? | | The load average is only sampled every 10 seconds to avoid | stressing small computers. | Question: How is the load sampled? top/uptime/etc. know load1/ load5/load15. Does GNU parallel look at one of these every 10s? I think the manpage shouldn't leave this as an exercise to the users and instead should be modified to answer these questions. Thomas From MAILER-DAEMON Thu Mar 15 10:52:59 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8C31-0000Rl-Jo for mharc-parallel@gnu.org; Thu, 15 Mar 2012 10:52:59 -0400 Received: from eggs.gnu.org ([208.118.235.92]:40933) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8C2u-0000Oe-Ko for parallel@gnu.org; Thu, 15 Mar 2012 10:52:58 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8C2s-0007OO-SL for parallel@gnu.org; Thu, 15 Mar 2012 10:52:52 -0400 Received: from mailout.rz.uni-frankfurt.de ([141.2.22.233]:33906 helo=mailout.cluster.uni-frankfurt.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8C2s-0007OA-MP for parallel@gnu.org; Thu, 15 Mar 2012 10:52:50 -0400 Received: from smtpauth-ha.cluster.uni-frankfurt.de ([10.1.1.120]) by mailout.cluster.uni-frankfurt.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8C2q-000392-U6 for parallel@gnu.org; Thu, 15 Mar 2012 15:52:48 +0100 Received: from p54b0a1f7.dip.t-dialin.net ([84.176.161.247] helo=[192.168.2.17]) by fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8C2l-0002lT-3t for parallel@gnu.org; Thu, 15 Mar 2012 15:52:43 +0100 Message-ID: <4F62023A.8090900@med.uni-frankfurt.de> Date: Thu, 15 Mar 2012 15:52:42 +0100 From: Thomas Sattler User-Agent: Mozilla/5.0 (X11; Linux i686; rv:11.0) 
Gecko/20120312 Thunderbird/11.0 MIME-Version: 1.0 To: parallel@gnu.org Subject: issues with --load X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: 7bit X-MailScanner: Found to be clean X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 141.2.22.233 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2012 14:52:58 -0000 Hi again ... The idea behind "--load" is great, but I think it's not working that good. I'd vote for a mechanism of delayed job-starts when "--load" is in use: I tried running lrzip in parallel. It's a multithreaded compressor written by Con Kolivas, optimized for large files. I used "--load" to make sure it will not overload my system, but it still did. I used a machine with 32 cores and it started 32 jobs, and the load went up to >90 for the first round of files to be compressed. No new jobs were started until load dropped below 32, but it's still not what a user might have expected. I know this is tricky as a job might start its helper threads at any time, but maybe parallel should start less jobs at the beginning and have a look at the resulting load for a while. 
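The delayed-start idea can be tried outside of parallel. This is only a rough sketch, assuming a Linux system that provides /proc/loadavg; `load_below` and `start_when_idle` are hypothetical names, and the 10-second pauses are an arbitrary choice, not a description of how GNU parallel samples the load.

```sh
#!/bin/sh
# Hypothetical helper: succeed if load $1 is strictly below limit $2.
load_below() {
    awk -v cur="$1" -v max="$2" 'BEGIN { exit !(cur < max) }'
}

# Hypothetical helper: wait until the 1-minute load average is below $1,
# start the remaining arguments as a background job, then pause so that
# the job's helper threads can show up in the load before the next start.
start_when_idle() {
    limit=$1
    shift
    while ! load_below "$(cut -d ' ' -f 1 /proc/loadavg)" "$limit"; do
        sleep 10
    done
    "$@" &
    sleep 10
}

load_below 0.50 32 && echo "below limit"    # prints: below limit
load_below 90.12 32 || echo "overloaded"    # prints: overloaded
```

Because the 1-minute average reacts slowly, a short pause only dampens the initial burst; it does not guarantee that the load stays under the limit.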
Thomas From MAILER-DAEMON Thu Mar 15 10:59:06 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8C8w-0007PW-45 for mharc-parallel@gnu.org; Thu, 15 Mar 2012 10:59:06 -0400 Received: from eggs.gnu.org ([208.118.235.92]:55610) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8C8V-00077l-UM for parallel@gnu.org; Thu, 15 Mar 2012 10:59:04 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8C8P-0000J1-QJ for parallel@gnu.org; Thu, 15 Mar 2012 10:58:39 -0400 Received: from mailout.rz.uni-frankfurt.de ([141.2.22.233]:34204 helo=mailout.cluster.uni-frankfurt.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8C8P-0000Ie-KG for parallel@gnu.org; Thu, 15 Mar 2012 10:58:33 -0400 Received: from smtpauth-ha.cluster.uni-frankfurt.de ([10.1.1.120]) by mailout.cluster.uni-frankfurt.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8C8N-0003Nr-DG for parallel@gnu.org; Thu, 15 Mar 2012 15:58:31 +0100 Received: from p54b0a1f7.dip.t-dialin.net ([84.176.161.247] helo=[192.168.2.17]) by fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8C8J-00038t-DW for parallel@gnu.org; Thu, 15 Mar 2012 15:58:27 +0100 Message-ID: <4F620392.7000901@med.uni-frankfurt.de> Date: Thu, 15 Mar 2012 15:58:26 +0100 From: Thomas Sattler User-Agent: Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20120312 Thunderbird/11.0 MIME-Version: 1.0 To: parallel@gnu.org Subject: "Support [...] 
is limited and may fail" X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: 7bit X-MailScanner: Found to be clean X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 141.2.22.233 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2012 14:59:05 -0000 Hi there ... The subjects line appears three times in the manpage: "Support for --xargs with --sshlogin is limited and may fail." "Support for -m with --sshlogin is limited and may fail." "Support for -X with --sshlogin is limited and may fail." But what exactly are the limits. What is the problem, and when does it happen? The wording does not forbid the usage of -m|-X|--xargs in combination --sshlogin. I'd vote for a little more information about this issue in the manpage. Thomas From MAILER-DAEMON Thu Mar 15 11:04:59 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8CEd-00038g-GJ for mharc-parallel@gnu.org; Thu, 15 Mar 2012 11:04:59 -0400 Received: from eggs.gnu.org ([208.118.235.92]:54772) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8CEY-00036t-D3 for parallel@gnu.org; Thu, 15 Mar 2012 11:04:58 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8CE6-00024s-BY for parallel@gnu.org; Thu, 15 Mar 2012 11:04:53 -0400 Received: from mailout.rz.uni-frankfurt.de ([141.2.22.233]:34585 helo=mailout.cluster.uni-frankfurt.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8CE6-00024W-5Q for parallel@gnu.org; Thu, 15 Mar 2012 11:04:26 -0400 Received: from smtpauth2-ha.cluster.uni-frankfurt.de ([10.1.1.121]) by mailout.cluster.uni-frankfurt.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8CE3-0003dp-OT for parallel@gnu.org; Thu, 15 Mar 2012 16:04:23 +0100 
Received: from p54b0a1f7.dip.t-dialin.net ([84.176.161.247] helo=[192.168.2.17]) by fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8CE1-0001tv-89 for parallel@gnu.org; Thu, 15 Mar 2012 16:04:21 +0100 Message-ID: <4F6204F4.6050502@med.uni-frankfurt.de> Date: Thu, 15 Mar 2012 16:04:20 +0100 From: Thomas Sattler User-Agent: Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20120312 Thunderbird/11.0 MIME-Version: 1.0 To: parallel@gnu.org Subject: Re: "parallel will behave similar to ..." References: <4F61EEFA.7000209@med.uni-frankfurt.de> In-Reply-To: X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-MailScanner: Found to be clean X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 141.2.22.233 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2012 15:04:58 -0000 (I'm sorry, Ole, for you getting this twice. I made a mistake and sent my answer to you, not the list.) >> | If command is given, GNU parallel will behave >> | similar to xargs. > > By not putting any options in we are simply saying parallel > can solve the same kind of problems as xargs. So, why not just use this sentence in the manpage? "parallel can solve the same kind of problems as xargs" The way it currently is could be taken literally. In fact, that's what I did, and it was quite con- fusing until I read the complete manpage. 
Thomas From MAILER-DAEMON Thu Mar 15 14:10:39 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8F8J-0006SG-6A for mharc-parallel@gnu.org; Thu, 15 Mar 2012 14:10:39 -0400 Received: from eggs.gnu.org ([208.118.235.92]:58920) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8F7u-0006El-Pm for parallel@gnu.org; Thu, 15 Mar 2012 14:10:37 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8F7o-0002Ep-VG for parallel@gnu.org; Thu, 15 Mar 2012 14:10:14 -0400 Received: from mail-wi0-f171.google.com ([209.85.212.171]:46786) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8F7o-0002D7-Ln for parallel@gnu.org; Thu, 15 Mar 2012 14:10:08 -0400 Received: by wibhj13 with SMTP id hj13so7350112wib.12 for ; Thu, 15 Mar 2012 11:10:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=HxvzwmyqLeub3c0ggLxZXhFMuEgAZ8YKTnbO+SiVV4Q=; b=DyoDg4US8WAhrfThp1ArIwl2gdMEfv/k1268+Kvxh/Fh50FCXfmbcKKHvz700KdsrJ rKWN2dc9VTpVF2du67ukApgbgjauHPLy58/sJtXS8hQrI/6Izdl/a7vc1biAMumAl1sK WpH6hz+mUF42mtPrAdPXfBHsQn1PzoKHw7ItRtdR8ajBv8PQ3W01V9wREWO/NexQKI0V Cl7y/SvzfbzBYxMMLAQDPls7ZriwfFcQZafx/0zhWxzOr0T+y9tR4J9FTadbhE5b2hXQ If259Ah9FTVux53HHG+lUzUqwBwhTSt1mNFVSEWJ/H0Zp4lJd3hi+bD4qaW4g1Minlx7 q0sA== MIME-Version: 1.0 Received: by 10.180.88.67 with SMTP id be3mr17825017wib.20.1331835005841; Thu, 15 Mar 2012 11:10:05 -0700 (PDT) Received: by 10.180.107.197 with HTTP; Thu, 15 Mar 2012 11:10:05 -0700 (PDT) Date: Thu, 15 Mar 2012 14:10:05 -0400 Message-ID: Subject: Any tips about parallel and sem when using intensive I/O operations? From: ningyi shao To: parallel@gnu.org Content-Type: multipart/alternative; boundary=f46d044481476c304b04bb4c0042 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 209.85.212.171 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2012 18:10:37 -0000 --f46d044481476c304b04bb4c0042 Content-Type: text/plain; charset=ISO-8859-1 Now I am using parallel (in fact, sem) to run samtools and other next generation sequencing analysis. Some things are quite similar as this blog described: http://zvfak.blogspot.com/2012/02/samtools-in-parallel.html But I like use sem in such way: export PRO="${HOME}/projects/2012-03-09_H3K4me3" > export RESULT="${PRO}/result/ngs.plot/2012-03-15" > export DATA="${PRO}/data/reheader" > mkdir -p ${RESULT} > > INPUTS=("Sample_H" "Sample_G") > > # setup tagdirectory of inputs > for INPUT in ${INPUTS[@]};do > sem -j4 samtools rmdup -s ${DATA}/${INPUT}.bam > ${RESULT}/${INPUT}_rmdup.bam > done > > TREATS=("Sample_D" "Sample_E" "Sample_F") > for TREAT in ${TREATS[@]};do > sem -j4 samtools rmdup -s ${DATA}/${TREAT}.bam > ${RESULT}/${TREAT}_rmdup.bam > done > sem -w > But I met some problems as when the load of the server heavy, then the output of the sem sometimes will lose output randomly. At the step there is no error report in the log files. The further processing then report and I feel quite trouble to trace back the problem because I didn't get clear clue about it. For example, a line of perfect bed file may should be: chr1 1000 2000 tag1 256 + and the result I found the line that was the output of sem subthread is : chr I know I'd better provide clear steps and example to repeat such problem, but it is quite randomly and what I only know is that it highly relates to high I/O intensive operations, especially when I use pipe. Does any body also meet such problem? Or I am the only one met such problem? Are there some tips that I could find the error early and trace the problem? 
I read the man page of sem, but no clear clues about such problem, and I also search the mail-archive.com about parallel, no clear solution for it. Thank you for my trivial problem. Best, Ning-Yi SHAO
--f46d044481476c304b04bb4c0042-- From MAILER-DAEMON Thu Mar 15 15:31:27 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8GOV-0004QL-Nr for mharc-parallel@gnu.org; Thu, 15 Mar 2012 15:31:27 -0400 Received: from eggs.gnu.org ([208.118.235.92]:37567) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8GOS-0004OS-OM for parallel@gnu.org; Thu, 15 Mar 2012 15:31:26 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8GON-0007pg-RI for parallel@gnu.org; Thu, 15 Mar 2012 15:31:24 -0400 Received: from mail-pz0-f41.google.com ([209.85.210.41]:44311) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8GON-0007mp-Hl for parallel@gnu.org; Thu, 15 Mar 2012 15:31:19 -0400 Received: by dadv6 with SMTP id v6so5493405dad.0 for ; Thu, 15 Mar 2012 12:31:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type; bh=IvGB9nRstJQHzXMvwefftwAYbA0JYHIBcgMNiGM7ems=; b=eS+yLyT178sS6akXEjNuZRtUIXakcoPXp2B5EEv6ay/Fu3Y4vZNc1LRJXfUaR0Wp8/ OQp7mkLF/4AmEQ+kJ35q1LSHBJX6ozoe9qjAwIzPqBDqlBIvb2pDCb54CWJFR2gfqFKD gie6wYie1ZkzvPi9rJp3MYkUGoFDC/RQiKGqkzpyvutmOS4ZlL/A7Dfmy3ikDNnQlUaS EroSgYSXO9pdpRJFR9RtWqRispp5qxEbEPn76v/PB60HDvMZ4Nq8JcAXrW/cd4PAG79g yTtcf17aLoiilqODfnrc4VVV61mfbTTvOyeF11BZFHzZmapd9zsRvwCLz8h2B0ArzKuQ WNYg== Received: by 10.68.224.225 with SMTP id rf1mr7342136pbc.133.1331839877305; Thu, 15 Mar 2012 12:31:17 -0700 (PDT) MIME-Version: 1.0 Sender: ole.tange@gmail.com Received: by 10.143.157.8 with HTTP; Thu, 15 Mar 2012 12:30:57 -0700 (PDT) In-Reply-To: <4F61FDC7.3000804@med.uni-frankfurt.de> References: <4F61FDC7.3000804@med.uni-frankfurt.de> From: Ole Tange Date: Thu, 15 Mar 2012 20:30:57 +0100 X-Google-Sender-Auth: OycxuL3Kzd8uzCeaZuxDXhgxv40 Message-ID: Subject: Re: bad example: GNU Parallel as dir processor To: Thomas Sattler Content-Type: 
text/plain; charset=ISO-8859-1 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 209.85.210.41 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2012 19:31:26 -0000

On Thu, Mar 15, 2012 at 3:33 PM, Thomas Sattler wrote:
> Hi there ...
>
> The manpage says, when grouping is disabled, "Output is printed
> as soon as possible" and "the outputs from different commands
> are mixed together".
>
> As far as I understand, that means not only lines might appear
> in quite a funny oder, it might also be, that one line of out-
> put is a mixture of several lines of different jobs.

That is correct. Unless you redirect the output:

... | parallel -u echo '>' {}.out '2>' {}.err

The above will work just fine.

> And than there is this example:
>
> |
> | GNU Parallel as dir processor
> |
> | [...]
> | The -u is needed because of a small bug in GNU parallel. If
> | that proves to be a problem, file a bug report.
> |
> :
> In case the mentioned bug has not already been fixed, I'd vote
> for removing the whole example, as it's just working.

The bug has not been fixed. To see it in action try:

inotifywait -q -m -r -e CLOSE_WRITE --format %w%f . | parallel -j 2 echo &
sleep 2
touch nooutput1
sleep 2
touch nooutput2
sleep 2
touch now_two_outputs
sleep 2
touch now_one_more_output

Compare that to the same with -u:

inotifywait -q -m -r -e CLOSE_WRITE --format %w%f . | parallel -u -j 2 echo &
sleep 2
touch nooutput1
sleep 2
touch nooutput2
sleep 2
touch now_two_outputs
sleep 2
touch now_one_more_output

I will consider fixing the bug when someone shows it is a real
problem to them or if someone provides a beautiful patch.

However, the dir processing is being used by some users so
removing it as an example does not sound like a good idea.
/Ole From MAILER-DAEMON Thu Mar 15 15:42:44 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8GZQ-000110-0Y for mharc-parallel@gnu.org; Thu, 15 Mar 2012 15:42:44 -0400 Received: from eggs.gnu.org ([208.118.235.92]:53783) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8GZ5-0000xG-JT for parallel@gnu.org; Thu, 15 Mar 2012 15:42:43 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8GZ2-0001y0-7H for parallel@gnu.org; Thu, 15 Mar 2012 15:42:23 -0400 Received: from mail-gx0-f169.google.com ([209.85.161.169]:34829) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8GZ2-0001xv-0z for parallel@gnu.org; Thu, 15 Mar 2012 15:42:20 -0400 Received: by ggeq1 with SMTP id q1so4099838gge.0 for ; Thu, 15 Mar 2012 12:42:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type :content-transfer-encoding; bh=8ePbrmhdqUGAu+2zXaFRGIQlnaThysU1Clut1/2mfRg=; b=bBywup/X1Kumhx4KkpC1bhkwLJCBUTkwgASdsn/KTH3cIm50mp6nT03cZCM6OYtvU6 jar8z7/7G9+BhR4+m5vhWCQQd7R3VHRAykxpn207w4kJtXUFuEvUJXeQwFCS0xskvM8I oaiSJ80ea+bf9q3f3h+utCkP8qGMu56qtiALXCm4c1h2+5JD4OqgLgu+pxMt3pKrrNsK vhVW10k5v+BbF+sZxUHX70vNyMQfj6u0RIDV2eH5Dot+e0yZT2Y7fSCuaqkBwTwy7tnW o0Tlx+2OtODyfyiBSUaZb+cQgtB5SNZat9xvRS2EuTJCt/uXruS4B8o02baHfZlvyrMw VnOQ== Received: by 10.68.240.41 with SMTP id vx9mr7503831pbc.10.1331840537425; Thu, 15 Mar 2012 12:42:17 -0700 (PDT) MIME-Version: 1.0 Sender: ole.tange@gmail.com Received: by 10.143.157.8 with HTTP; Thu, 15 Mar 2012 12:41:57 -0700 (PDT) In-Reply-To: References: From: Ole Tange Date: Thu, 15 Mar 2012 20:41:57 +0100 X-Google-Sender-Auth: VemSYRcgTRos1mmOHCbVGgkH1QA Message-ID: Subject: Re: Any tips about parallel and sem when using intensive I/O operations? 
To: ningyi shao
Cc: parallel@gnu.org

On Thu, Mar 15, 2012 at 7:10 PM, ningyi shao wrote:
> Now I am using parallel (in fact, sem) to run samtools and other next
> generation sequencing analysis.
> Some things are quite similar to what this blog describes:
> http://zvfak.blogspot.com/2012/02/samtools-in-parallel.html
> But I like to use sem in this way:
>
>> export PRO="${HOME}/projects/2012-03-09_H3K4me3"
>> export RESULT="${PRO}/result/ngs.plot/2012-03-15"
>> export DATA="${PRO}/data/reheader"
>> mkdir -p ${RESULT}
>>
>> INPUTS=("Sample_H" "Sample_G")
>>
>> # setup tagdirectory of inputs
>> for INPUT in ${INPUTS[@]};do
>>     sem -j4 samtools rmdup -s ${DATA}/${INPUT}.bam ${RESULT}/${INPUT}_rmdup.bam
>> done
>>
>> TREATS=("Sample_D" "Sample_E" "Sample_F")
>> for TREAT in ${TREATS[@]};do
>>     sem -j4 samtools rmdup -s ${DATA}/${TREAT}.bam ${RESULT}/${TREAT}_rmdup.bam
>> done
>> sem -w

Do you get the same problem with:

  parallel samtools rmdup -s ${DATA}/{}.bam ${RESULT}/{}_rmdup.bam ::: "${INPUTS[@]}"
  parallel samtools rmdup -s ${DATA}/{}.bam ${RESULT}/{}_rmdup.bam ::: "${TREATS[@]}"

or even:

  parallel samtools rmdup -s ${DATA}/{}.bam ${RESULT}/{}_rmdup.bam ::: "${INPUTS[@]}" "${TREATS[@]}"

> But I met some problems: when the load of the server is heavy, the
> output of sem is sometimes lost at random.

As you can imagine it is hard for others to reproduce that problem.
That is why the man page says:

  Your bug report should always include:

  · The output of parallel --version.
    If you are not running the latest released version you should
    specify why you believe the problem is not fixed in that version.

  · A complete example that others can run that shows the problem.
    A combination of seq, cat, echo, and sleep can reproduce most
    errors. If your example requires large files, see if you can
    make them by something like seq 1000000 >file.

  If you suspect the error is dependent on your distribution, please
  see if you can reproduce the error on one of these VirtualBox images:
  http://sourceforge.net/projects/virtualboximage/files/

  Specifying the name of your distribution is not enough as you may
  have installed software that is not in the VirtualBox images.

/Ole

From MAILER-DAEMON Thu Mar 15 15:52:18 2012
In-Reply-To: <4F620392.7000901@med.uni-frankfurt.de>
References: <4F620392.7000901@med.uni-frankfurt.de>
From: Ole Tange
Date: Thu, 15 Mar 2012 20:51:27 +0100
Subject: Re: "Support [...] is limited and may fail"
To: Thomas Sattler
Cc: parallel@gnu.org

On Thu, Mar 15, 2012 at 3:58 PM, Thomas Sattler wrote:
> The subject line appears three times in the manpage:
>
>  "Support for --xargs with --sshlogin is limited and may fail."
>  "Support for -m with --sshlogin is limited and may fail."
>  "Support for -X with --sshlogin is limited and may fail."
>
> But what exactly are the limits? What is the problem,
> and when does it happen? The wording does not forbid
> the usage of -m|-X|--xargs in combination with --sshlogin.
>
> I'd vote for a little more information about this
> issue in the manpage.

Actions speak louder than words: Instead of voting for it, please do
the investigation to figure out the limits, and why, how and when it
fails. Then write your suggestion for that section of the man page.
I can just see it sometimes fails, and because:

* I never use it, and
* no one has given a good case for using it, and
* no one has provided a beautiful patch

then it has not been fixed nor elaborated.

In other words: it is not worth my time to fix it. If you find it is
worth your time, you are most certainly welcome.

/Ole

From MAILER-DAEMON Thu Mar 15 16:03:23 2012
In-Reply-To:
 <4F62023A.8090900@med.uni-frankfurt.de>
References: <4F62023A.8090900@med.uni-frankfurt.de>
From: Ole Tange
Date: Thu, 15 Mar 2012 21:02:33 +0100
Subject: Re: issues with --load
To: Thomas Sattler
Cc: parallel@gnu.org

On Thu, Mar 15, 2012 at 3:52 PM, Thomas Sattler wrote:
> The idea behind "--load" is great, but I think it's not working that
> well. I'd vote for a mechanism of delayed job-starts when "--load"
> is in use.

I can see a delay mechanism can be useful elsewhere, but it is not
that hard for you to do yourself:

  # Delay the first 4 jobs 2, 4, 6, 8 seconds.
  seq 10 | parallel '[ {#} -lt 5 ] && sleep $(({#}*2)); echo foo $(({#}*2)) {}'

/Ole

From MAILER-DAEMON Thu Mar 15 16:25:19 2012
In-Reply-To: <4F620221.3060304@med.uni-frankfurt.de>
References: <4F620221.3060304@med.uni-frankfurt.de>
From: Ole Tange
Date: Thu, 15 Mar 2012 21:24:52 +0100
Subject: Re: questions about --load
To:
 Thomas Sattler
Cc: parallel@gnu.org

On Thu, Mar 15, 2012 at 3:52 PM, Thomas Sattler wrote:
> Hi there ...
>
> I'd vote for improving the manpage about --load:
:
> I think the manpage shouldn't leave this as an exercise to the
> users and instead should be modified to answer these questions.

Fixed in git version.

/Ole

From MAILER-DAEMON Thu Mar 15 16:30:44 2012
Message-ID: <4F625161.9010502@med.uni-frankfurt.de>
Date: Thu, 15 Mar 2012 21:30:25 +0100
From:
 Thomas Sattler
To: Ole Tange
Subject: Re: bad example: GNU Parallel as dir processor
References: <4F61FDC7.3000804@med.uni-frankfurt.de>
Cc: parallel@gnu.org

Hi again ...

> However, the dir processing is being used by some users so
> removing it as an example does not sound like a good idea.

Maybe I missed the point, but isn't it a fact that -u produces
unreliable output (due to the race condition) and so the dir
processing will eventually fail?
Thomas

From MAILER-DAEMON Thu Mar 15 16:37:02 2012
In-Reply-To: <4F61F39A.3090603@med.uni-frankfurt.de>
References: <4F61F39A.3090603@med.uni-frankfurt.de>
From: Ole Tange
Date: Thu, 15 Mar 2012 21:36:31 +0100
Subject: Re: feature request: {#} with leading zeroes
To: Thomas Sattler
Content-Type:
 text/plain; charset=ISO-8859-1
Cc: parallel@gnu.org

On Thu, Mar 15, 2012 at 2:50 PM, Thomas Sattler wrote:
> There is an example for {#} in the manpage. Its description says,
> it might be "useful for making input PNG's for ffmpeg":
>
> |
> | find . -type f | sort | parallel convert {} {#}.png
> |
>
> I'd guess that it wouldn't work with ffmpeg, as the images would
> be sorted like this:
>
>  10.png 11.png 12.png [...] 18.png 19.png 1.png [...]
>
> Wouldn't it be useful to have sequence numbers with leading
> zeroes here?

Show me more examples where it would be useful. The ffmpeg could be
fixed with:

  ls | sort -n | parallel -j 1 -X ffmpeg ... {}

If there are better use cases I might consider something like:

  {0#} = {#} = one digit or more
  {000#} = three digits or more

/Ole

From MAILER-DAEMON Thu Mar 15 16:40:25 2012
In-Reply-To: <4F625161.9010502@med.uni-frankfurt.de>
References: <4F61FDC7.3000804@med.uni-frankfurt.de> <4F625161.9010502@med.uni-frankfurt.de>
From: Ole Tange
Date: Thu, 15 Mar 2012 21:39:57 +0100
Subject: Re: bad example: GNU Parallel as dir processor
To: Thomas Sattler
Cc: parallel@gnu.org

On Thu, Mar 15, 2012 at 9:30 PM, Thomas Sattler wrote:
> Hi again ...
>
>> However, the dir processing is being used by some users so
>> removing it as an example does not sound like a good idea.
>
> Maybe I missed the point, but isn't it a fact that -u produces
> unreliable output (due to the race condition) and so the dir
> processing will eventually fail?

Not at all. It is just the output to stdout that may be mixed up (with
-u) or delayed (without -u). Everything else works as expected.
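
The "delayed" half of that trade-off can be illustrated with a rough
sketch in plain shell. It does not use GNU parallel and is not how GNU
parallel is implemented; the names "job" and "run_grouped" are
hypothetical. Each job's stdout is buffered in its own temporary file
and printed only after the job has finished, so lines from different
jobs cannot mix, but nothing is visible while a job is still running.

```sh
# Hypothetical job that prints two lines with a pause in between.
job() {
    echo "$1 start"
    sleep 1
    echo "$1 end"
}

# Grouped output: buffer each job's stdout in its own temp file and
# print the files only after all jobs are done. For simplicity the
# buffers are printed in argument order, not in completion order.
run_grouped() {
    tmpdir=$(mktemp -d) || return 1
    for name in "$@"; do
        job "$name" > "$tmpdir/$name.out" &
    done
    wait
    for name in "$@"; do
        cat "$tmpdir/$name.out"
    done
    rm -rf "$tmpdir"
}

run_grouped a b
```

Running the jobs in the background without the temp files would let
both "start" lines appear before either "end" line, which corresponds
to the mixed-up case.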
/Ole

From MAILER-DAEMON Thu Mar 15 16:42:59 2012
In-Reply-To: <4F61FAB0.9090208@med.uni-frankfurt.de>
References: <4F61FAB0.9090208@med.uni-frankfurt.de>
From: Ole Tange
Date: Thu, 15 Mar 2012 21:42:15 +0100
Subject: Re: what's the use case of an eof-string?
To: Thomas Sattler
Cc: parallel@gnu.org

On Thu, Mar 15, 2012 at 3:20 PM, Thomas Sattler wrote:
> Hi there ...
>
> eof-strings are only mentioned in the OPTIONS part,
> but there aren't any examples for this. Could somebody
> please give a real-life use case for it?

I cannot. But the reason why the option exists is xargs compatibility.

/Ole

From MAILER-DAEMON Thu Mar 15 16:46:30 2012
In-Reply-To: <4F61F746.4040704@med.uni-frankfurt.de>
References: <4F61F746.4040704@med.uni-frankfurt.de>
From: Ole Tange
Date: Thu, 15 Mar 2012 21:46:01 +0100
Subject: Re: manpage: --delimiter
To: Thomas Sattler
Cc: parallel@gnu.org

On Thu, Mar 15, 2012 at 3:05 PM, Thomas Sattler wrote:
> Hi again ...
>
> Could someone please enlighten me what the use case of --delimiter is?

I cannot. --delimiter is there for xargs compatibility. The text is
heavily inspired by the man page for xargs.
/Ole

From MAILER-DAEMON Thu Mar 15 16:48:21 2012
References: <4F61F39A.3090603@med.uni-frankfurt.de>
Date: Thu, 15 Mar 2012 13:48:10 -0700
Subject: Re: feature request: {#} with leading zeroes
From: Shao Zhang
To: Ole Tange
Cc: Thomas Sattler, parallel@gnu.org

In production, we only really have 2 types of sequential formatting.

No leading zeros: 8,9,10,11
and fixed number of digits: 008,009,010,011

'Three digits or more' is not really a case I've seen in the wild.
Just my 2 cents having to work with production image sequences on a
daily basis.

I typically use parallel to convert images out of order, in which case
this functionality isn't necessary. We do have internal tools that
will process sequentially like this:

  loop 1-1000 -cpus 8 'convert this.%d.tif that.%6d.jpg'

This isn't guaranteed to be 100% sequential when cpus > 1 but it's
never going to be horribly out of order.

I'm not sure if the functionality is necessary unless there's
something that absolutely needs to be done in order or roughly in
order. You can always break out your data by dirs and run parallel
against the ones you want done first.

On Thu, Mar 15, 2012 at 1:36 PM, Ole Tange wrote:
> On Thu, Mar 15, 2012 at 2:50 PM, Thomas Sattler wrote:
>> There is an example for {#} in the manpage. Its description says,
>> it might be "useful for making input PNG's for ffmpeg":
>>
>> |
>> | find . -type f | sort | parallel convert {} {#}.png
>> |
>>
>> I'd guess that it wouldn't work with ffmpeg, as the images would
>> be sorted like this:
>>
>>  10.png 11.png 12.png [...] 18.png 19.png 1.png [...]
>>
>> Wouldn't it be useful to have sequence numbers with leading
>> zeroes here?
>
> Show me more examples where it would be useful. The ffmpeg could be
> fixed with:
>
>   ls | sort -n | parallel -j 1 -X ffmpeg ... {}
>
> If there are better use cases I might consider something like:
>
>   {0#} = {#} = one digit or more
>   {000#} = three digits or more
>
> /Ole
>

From MAILER-DAEMON Thu Mar 15 16:49:51 2012
Message-ID: <4F6255CF.70507@med.uni-frankfurt.de>
Date: Thu, 15 Mar 2012 21:49:19 +0100
From: Thomas Sattler
To: Ole Tange
Subject: Re: issues with --load
References: <4F62023A.8090900@med.uni-frankfurt.de>
Cc: parallel@gnu.org
List-Id: GNU Parallel Discussion

>> The idea behind "--load" is great, but I think it's not working that
>> well. I'd vote for a mechanism of delayed job-starts when "--load"
>> is in use.
>
> I can see a delay mechanism can be useful elsewhere, but it is not
> that hard for you to do yourself:

That was not what I wanted. Yes, an experienced user can easily
create a script that does this, but my intention is something
quite different: GNU parallel does a great job in putting high
load on a system. And my question was: Shouldn't we take more
care that not-so-experienced users do not overload their machines
by accident?

(Also see my to-be-written mail about "transfer" and NFS, and how
I hit my cluster in a way that I needed to hard-reset the compute
nodes to bring the head node back to life. :-))

Thomas

From MAILER-DAEMON Thu Mar 15 16:59:35 2012
fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8Hla-0006S7-Sk; Thu, 15 Mar 2012 21:59:22 +0100 Message-ID: <4F62582A.5030808@med.uni-frankfurt.de> Date: Thu, 15 Mar 2012 21:59:22 +0100 From: Thomas Sattler User-Agent: Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20120312 Thunderbird/11.0 MIME-Version: 1.0 To: Ole Tange Subject: Re: bad example: GNU Parallel as dir processor References: <4F61FDC7.3000804@med.uni-frankfurt.de> <4F625161.9010502@med.uni-frankfurt.de> In-Reply-To: X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-MailScanner: Found to be clean X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 141.2.22.233 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2012 20:59:33 -0000 >> Maybe I missed the point, but isn't it a fact that -u produces >> unreliable output (due to the race condition) and so the dir >> processing will eventually fail? > > Not at all. It is just the output to stdout that may be mixed up (with > -u) or delayed (without -u). Everything else works as expected. Argh, there it is, my mistake: The 'echo' _is_ the dir processor, I somehow thought it would _feed_ the dir processor. So it is not the dir processor's INPUT that might be unreliable, it's its OUTPUT. And hopefully no one cares about that. Thanks a lot Ole! I've been kind of blind.
Thomas From MAILER-DAEMON Thu Mar 15 17:04:53 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8Hqv-0000n3-B7 for mharc-parallel@gnu.org; Thu, 15 Mar 2012 17:04:53 -0400 Received: from eggs.gnu.org ([208.118.235.92]:55937) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8Hqs-0000lZ-A5 for parallel@gnu.org; Thu, 15 Mar 2012 17:04:51 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8Hqq-00015T-LF for parallel@gnu.org; Thu, 15 Mar 2012 17:04:49 -0400 Received: from mailout.rz.uni-frankfurt.de ([141.2.22.233]:48664 helo=mailout.cluster.uni-frankfurt.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8Hqm-00014e-7l; Thu, 15 Mar 2012 17:04:44 -0400 Received: from smtpauth2-ha.cluster.uni-frankfurt.de ([10.1.1.121]) by mailout.cluster.uni-frankfurt.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8Hqk-00085x-5T; Thu, 15 Mar 2012 22:04:42 +0100 Received: from p54b0a1f7.dip.t-dialin.net ([84.176.161.247] helo=[192.168.2.17]) by fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8Hqh-0006kc-Nv; Thu, 15 Mar 2012 22:04:39 +0100 Message-ID: <4F625966.9020509@med.uni-frankfurt.de> Date: Thu, 15 Mar 2012 22:04:38 +0100 From: Thomas Sattler User-Agent: Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20120312 Thunderbird/11.0 MIME-Version: 1.0 To: Ole Tange Subject: Re: what's the use case of an eof-string? 
References: <4F61FAB0.9090208@med.uni-frankfurt.de> In-Reply-To: X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-MailScanner: Found to be clean X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 141.2.22.233 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2012 21:04:51 -0000 >> eof-strings are only mentioned in the OPTIONS part, >> but there aren't any examples for this. Could some- >> body please give a real-live usecase for it? > > I cannot. But the reason why the options exists is > because of xargs compatibility. OK, so let's forget about the "real-life" aspect. Is it correct, that the eof-thing exists, to stop processing in the middle of an input file, if the magic eof-string appears? Something like this: ---8<-------------------------------------- 1 2 3 4 5 6 7 8 9 2 3 4 5 6 7 8 9 0 3 4 5 6 7 8 9 0 1 4 4 5 5 6 6 6 7 7 EOF a comment more comments -------------------------------------->8--- Thomas From MAILER-DAEMON Thu Mar 15 17:47:26 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8IW6-0007ob-Ts for mharc-parallel@gnu.org; Thu, 15 Mar 2012 17:47:26 -0400 Received: from eggs.gnu.org ([208.118.235.92]:41394) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8IW4-0007nr-D8 for parallel@gnu.org; Thu, 15 Mar 2012 17:47:25 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8IW2-0000b9-KK for parallel@gnu.org; Thu, 15 Mar 2012 17:47:23 -0400 Received: from mailout.rz.uni-frankfurt.de ([141.2.22.233]:49654 helo=mailout.cluster.uni-frankfurt.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8IW2-0000aO-E4 for parallel@gnu.org; Thu, 15 Mar 2012 17:47:22 -0400 Received: from 
smtpauth-ha.cluster.uni-frankfurt.de ([10.1.1.120]) by mailout.cluster.uni-frankfurt.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8IVz-0000wC-QU for parallel@gnu.org; Thu, 15 Mar 2012 22:47:19 +0100 Received: from p54b0a1f7.dip.t-dialin.net ([84.176.161.247] helo=[192.168.2.17]) by fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8IVv-0002Gv-AQ for parallel@gnu.org; Thu, 15 Mar 2012 22:47:15 +0100 Message-ID: <4F626362.4000306@med.uni-frankfurt.de> Date: Thu, 15 Mar 2012 22:47:14 +0100 From: Thomas Sattler User-Agent: Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20120312 Thunderbird/11.0 MIME-Version: 1.0 To: parallel@gnu.org Subject: transfer and NFS homes X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: 7bit X-MailScanner: Found to be clean X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 141.2.22.233 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2012 21:47:25 -0000 OK, here is how I (nearly) killed my cluster: -- Story --------------------------------------------------- Trying to see GNU parallel in action, I decided to repack collectl's logfiles. On my system they grow to about 700-900MB (raw) per day, which becomes about 150MB (gzipped). First I put them into a scratch dir and unpacked them. I know that it would have been possible to unpack/repack them in only one step, I just wanted the machine also to have some big (data-)files to be transferred. :-) Then I started GNU parallel to use five 32-core machines to pack these raw files. As they were in a local scratch directory, I "had" to transfer them to the compute nodes. And there was my mistake: I used a relative path to the files.
(OK, I should say that the five compute nodes all have local scratch dirs but also share homes via NFS.) And there we are: The uncompressed logfiles were transferred to the compute nodes and placed in the NFS home dir. In other words: The files were in fact sent back to the head node. All six machines (headnode and compute nodes) became unusable quite soon. I guess the nodes cached the data for a while, so all five machines had huge buffers to feed NFS. :-) To bring a long story to an end: Killing parallel and rsync did not help, the headnode's nfsds were still very busy. I waited several minutes, the headnode's load was still increasing and the nodes were unusable, too. I had to hard reset the nodes to get the headnode back. -- Question ------------------------------------------------ As I asked before in "issues with --load": Shouldn't we take more care that (not-so-experienced) users do not overload their machines by accident? In this case: Shouldn't GNU parallel detect a situation like this ("transfer to NFS homes") and exit with an error?
Thomas From MAILER-DAEMON Thu Mar 15 18:03:51 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8Ily-0007d3-WC for mharc-parallel@gnu.org; Thu, 15 Mar 2012 18:03:50 -0400 Received: from eggs.gnu.org ([208.118.235.92]:48205) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8Ilw-0007aN-Bo for parallel@gnu.org; Thu, 15 Mar 2012 18:03:50 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8Ilu-0003jQ-PN for parallel@gnu.org; Thu, 15 Mar 2012 18:03:47 -0400 Received: from mailout.rz.uni-frankfurt.de ([141.2.22.233]:49882 helo=mailout.cluster.uni-frankfurt.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8IlX-0003gU-Qv; Thu, 15 Mar 2012 18:03:23 -0400 Received: from smtpauth2-ha.cluster.uni-frankfurt.de ([10.1.1.121]) by mailout.cluster.uni-frankfurt.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8IlV-0001NN-2j; Thu, 15 Mar 2012 23:03:21 +0100 Received: from p54b0a1f7.dip.t-dialin.net ([84.176.161.247] helo=[192.168.2.17]) by fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8IlT-0001bY-Cx; Thu, 15 Mar 2012 23:03:19 +0100 Message-ID: <4F626726.7060905@med.uni-frankfurt.de> Date: Thu, 15 Mar 2012 23:03:18 +0100 From: Thomas Sattler User-Agent: Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20120312 Thunderbird/11.0 MIME-Version: 1.0 To: Ole Tange Subject: Re: manpage: --delimiter References: <4F61F746.4040704@med.uni-frankfurt.de> In-Reply-To: X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-MailScanner: Found to be clean X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 141.2.22.233 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2012 
22:03:50 -0000 >> Could someone please enlight me what's the usecase of --delimiter? > > I cannot. --delimiter is there for xargs compatibility. > > The text is heavily inspired by the man page for xargs. OK, I just put the two manpages next to each other and now I understand the meaning of "is heavily inspired by". ;-) Let me change my question: What is, from your point of view, the difference between "--colsep" and "--delimiter"? Or is 'colsep' "just an enhanced version" of 'delimiter', that trims the items and uses Perl Regular Expression instead of simple characters? Thomas From MAILER-DAEMON Thu Mar 15 18:11:09 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8It3-0006b8-Ie for mharc-parallel@gnu.org; Thu, 15 Mar 2012 18:11:09 -0400 Received: from eggs.gnu.org ([208.118.235.92]:39183) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8It0-0006WS-S2 for parallel@gnu.org; Thu, 15 Mar 2012 18:11:08 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8Isz-000583-E2 for parallel@gnu.org; Thu, 15 Mar 2012 18:11:06 -0400 Received: from mailout.rz.uni-frankfurt.de ([141.2.22.233]:50003 helo=mailout.cluster.uni-frankfurt.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8Isv-000571-7u; Thu, 15 Mar 2012 18:11:01 -0400 Received: from smtpauth-ha.cluster.uni-frankfurt.de ([10.1.1.120]) by mailout.cluster.uni-frankfurt.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8Iss-0001aP-NL; Thu, 15 Mar 2012 23:10:58 +0100 Received: from p54b0a1f7.dip.t-dialin.net ([84.176.161.247] helo=[192.168.2.17]) by fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8Isn-0003aC-VY; Thu, 15 Mar 2012 23:10:54 +0100 Message-ID: <4F6268ED.3020003@med.uni-frankfurt.de> Date: Thu, 15 Mar 2012 23:10:53 +0100 From: Thomas Sattler User-Agent: Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20120312 Thunderbird/11.0 MIME-Version: 1.0 To: Ole Tange 
Subject: Re: "Support [...] is limited and may fail" References: <4F620392.7000901@med.uni-frankfurt.de> In-Reply-To: X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-MailScanner: Found to be clean X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 141.2.22.233 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2012 22:11:08 -0000 > Action speaks louder than words: Instead of voting for it, please do > the investigation to figure out the limits, and why, how and when it > fails. Then write your suggestion for that section of the man page. I see. While reading the manpage I thought these were already known and just not (yet) included in the manpage. And as I'm always greedy for knowledge, I tried to ask for the facts. :-) Thomas From MAILER-DAEMON Thu Mar 15 19:25:30 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8K30-0003Xk-GX for mharc-parallel@gnu.org; Thu, 15 Mar 2012 19:25:30 -0400 Received: from eggs.gnu.org ([208.118.235.92]:50408) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8K2x-0003WF-Fn for parallel@gnu.org; Thu, 15 Mar 2012 19:25:28 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8K2v-0002XE-Lf for parallel@gnu.org; Thu, 15 Mar 2012 19:25:27 -0400 Received: from mail-pz0-f41.google.com ([209.85.210.41]:61337) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8K2v-0002X7-CL for parallel@gnu.org; Thu, 15 Mar 2012 19:25:25 -0400 Received: by dadv6 with SMTP id v6so5755108dad.0 for ; Thu, 15 Mar 2012 16:25:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date
:x-google-sender-auth:message-id:subject:to:cc:content-type; bh=teaDP5dL6koJoDMhdCtQB5lUMgL/5qWMgpaYgtRlwZM=; b=bMuJbnfS0f/woUh9VmpNqk8le3SHOohv6Ms8fD3o/RvIVzOmIE3zLehpbCNN7r61Go sqoVDRqHu0A6rXUhuV3gO+r54fc2HN3OpPeKivqyYQ2f/0O15JeyXv1HG6K0bB8s6qTm I1uWpl3KmrFpmISUxSqiv01cR0u6UKGu9pMKE0PCJW3UyoqbXve1XcfVRIQp9hfYxtwB igjJZfNehXnfzJJVlud5hYXxpO5Lhtkt8BJBWW7yD48bFleissmgdXPM09S+TLI+kqa/ HJ/MKJLapp7IWaTnZUG1HY0wOIzicZ2sliQUkFLm8OwPpy0g7YBmFz0d4X/oUjzL3UAm Cg9w== Received: by 10.68.241.2 with SMTP id we2mr8686846pbc.53.1331853923370; Thu, 15 Mar 2012 16:25:23 -0700 (PDT) MIME-Version: 1.0 Sender: ole.tange@gmail.com Received: by 10.143.157.8 with HTTP; Thu, 15 Mar 2012 16:25:03 -0700 (PDT) In-Reply-To: <4F626726.7060905@med.uni-frankfurt.de> References: <4F61F746.4040704@med.uni-frankfurt.de> <4F626726.7060905@med.uni-frankfurt.de> From: Ole Tange Date: Fri, 16 Mar 2012 00:25:03 +0100 X-Google-Sender-Auth: MqoNct4svyG5SsGgGH8RG36dk0E Message-ID: Subject: Re: manpage: --delimiter To: Thomas Sattler Content-Type: text/plain; charset=ISO-8859-1 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 209.85.210.41 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2012 23:25:28 -0000 On Thu, Mar 15, 2012 at 11:03 PM, Thomas Sattler wrote: > Let me change my question: What is, from your point of view, > the difference between "--colsep" and "--delimiter"? > > Or is 'colsep' "just an enhanced version" of 'delimiter', > that trims the items and uses Perl Regular Expression > instead of simple characters? echo a-b-c,d-e-f,g-h-i | parallel -d , --colsep - echo {3}/{2}/{1} Colsep is a field separator. Delimiter is a record separator. 
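The record/field distinction above maps directly onto awk's RS and FS variables, so it can be illustrated without GNU Parallel installed. This is only a minimal sketch of the concept: the function name reverse_fields is made up for illustration, and awk merely plays the roles that the message assigns to --delimiter (record separator) and --colsep (field separator).

```shell
# Hypothetical helper (not part of GNU Parallel): prints the three fields
# of every record read from stdin in reverse order.
#   RS=','  record separator, the role given above to --delimiter
#   -F'-'   field separator,  the role given above to --colsep
reverse_fields() {
  awk -v RS=',' -F'-' '{ print $3 "/" $2 "/" $1 }'
}

# printf (rather than echo) keeps a trailing newline out of the last field.
printf 'a-b-c,d-e-f,g-h-i' | reverse_fields
```

With this input the sketch prints c/b/a, f/e/d and i/h/g on separate lines: three records, each split into three fields.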
/Ole From MAILER-DAEMON Thu Mar 15 19:27:55 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8K5L-0004Yr-RN for mharc-parallel@gnu.org; Thu, 15 Mar 2012 19:27:55 -0400 Received: from eggs.gnu.org ([208.118.235.92]:50868) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8K5J-0004Xb-LS for parallel@gnu.org; Thu, 15 Mar 2012 19:27:54 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8K4z-0002wF-Ql for parallel@gnu.org; Thu, 15 Mar 2012 19:27:53 -0400 Received: from mail-pz0-f41.google.com ([209.85.210.41]:55041) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8K4z-0002vp-IM for parallel@gnu.org; Thu, 15 Mar 2012 19:27:33 -0400 Received: by dadv6 with SMTP id v6so5757355dad.0 for ; Thu, 15 Mar 2012 16:27:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type; bh=E+6RztplMskrvktX7A+L5ld4g/wu/VdzWFnSsFTaHXA=; b=RL8fV4+6OczCRs4SCRUmo6arxZwZmQ1ur+U5HJHM6URnJQ/aWKsQZNn9w+cborBqf4 J50ZOLfyklBjVol2pDaFeTyCuXMGH5/LjvUHiXSl16uRb8vfi1FZ8V4onmDhzGqEDKDn uFkgSmSHzXfpUhnU68CsZcj+3Sh0tXumUB92N2lXNJ9haWUKYGUTKSyWN7fOcze5KVX1 Zho1TGp78cvU2IMEjZKXwCLd75flOcgbsjFtFFz6dKBU7j7ZuDsH1OmCCgA8VRAaMcOW p4W/gHOmRNad3ePbRl/K6Csw0RqXgjIr66ptPbanIRnmzEGZQK1W/b5ym+RexUIT2Sqm DnQA== Received: by 10.68.238.1 with SMTP id vg1mr8930580pbc.33.1331854051590; Thu, 15 Mar 2012 16:27:31 -0700 (PDT) MIME-Version: 1.0 Sender: ole.tange@gmail.com Received: by 10.143.157.8 with HTTP; Thu, 15 Mar 2012 16:27:11 -0700 (PDT) In-Reply-To: <4F6255CF.70507@med.uni-frankfurt.de> References: <4F62023A.8090900@med.uni-frankfurt.de> <4F6255CF.70507@med.uni-frankfurt.de> From: Ole Tange Date: Fri, 16 Mar 2012 00:27:11 +0100 X-Google-Sender-Auth: g0sgqujQyfICErieyykOlNXzV5M Message-ID: Subject: Re: issues with --load To: Thomas Sattler Content-Type: text/plain; 
charset=ISO-8859-1 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 209.85.210.41 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2012 23:27:55 -0000 On Thu, Mar 15, 2012 at 9:49 PM, Thomas Sattler wrote: > That was not what I wanted. Yes, an experienced user can easily create > a script that does this, but my intension is something quite different: > > GNU parallel does a great job in putting high load on a system. And > my question was: Shouldn't we take more care that not-so-experienced > users do not overload their machines by accident. man niceload /Ole From MAILER-DAEMON Thu Mar 15 19:59:07 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8KZX-0001iq-6x for mharc-parallel@gnu.org; Thu, 15 Mar 2012 19:59:07 -0400 Received: from eggs.gnu.org ([208.118.235.92]:40468) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8KZU-0001hK-IH for parallel@gnu.org; Thu, 15 Mar 2012 19:59:05 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8KZS-0008Ui-OG for parallel@gnu.org; Thu, 15 Mar 2012 19:59:04 -0400 Received: from mail-pb0-f41.google.com ([209.85.160.41]:55764) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8KZS-0008Ue-F9 for parallel@gnu.org; Thu, 15 Mar 2012 19:59:02 -0400 Received: by pbcup15 with SMTP id up15so103203pbc.0 for ; Thu, 15 Mar 2012 16:59:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type; bh=1Fk9kiCjSc1YCcgLzn53rMwx/mro3V4UOWhRbb48ckI=; b=YPfdc7uCI9vYsWl85ml1G4F3JNzdXDmXbp9q5s+3Cqrs1TBuRsA6Tl6KgC9dteoouu 
ixFiXY5QIeTf7Yci2ou7tNNF34iI3ZJFOGHkn6QSn3zSIlZ76wXr8YZ11G3GpBUnbHoo yW9RHA1NWw81UB9eJU6836X4KcHwKy1HypPMeleq87cWGcfaQZ7UgbynGg7KgeRtDyGp h67HeFHdLP/cD+jBVCn8vOXnH1meY/0cqXb2he2x9CsvH18VGXMUb99oXifBROld2tdT jC9annSgdgmwVY4yxXQtH7UPQb9TYOGzA92vBONBNseY0Z5wi3F5raUt1cVsq0K+10/0 cSUA== Received: by 10.68.230.41 with SMTP id sv9mr8859333pbc.48.1331855939954; Thu, 15 Mar 2012 16:58:59 -0700 (PDT) MIME-Version: 1.0 Sender: ole.tange@gmail.com Received: by 10.143.157.8 with HTTP; Thu, 15 Mar 2012 16:58:39 -0700 (PDT) In-Reply-To: <4F626362.4000306@med.uni-frankfurt.de> References: <4F626362.4000306@med.uni-frankfurt.de> From: Ole Tange Date: Fri, 16 Mar 2012 00:58:39 +0100 X-Google-Sender-Auth: _xkd2GOiLZCBuhA6XQRG6NLP1aU Message-ID: Subject: Re: transfer and NFS homes To: Thomas Sattler Content-Type: text/plain; charset=ISO-8859-1 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 209.85.160.41 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2012 23:59:06 -0000 On Thu, Mar 15, 2012 at 10:47 PM, Thomas Sattler wrote: > OK, here is how I (nearly) killed my cluster: I like the sound of it: GNU Parallel - the cluster killer! > As I asked before in "issues with --load": Shouldn't we take > more care that (not-so-experienced) users do not overload > their machines by accident? It is extremely hard to tell the difference between a power user maxing out a server and a novice doing something that overloads the server. I have several times used GNU Parallel causing a cpu load of > 1000, because it was faster to complete my task that way. GNU Parallel is made for power users and that is its primary goal. 
If novices can use it as well, then that is fine, but GNU Parallel will not shield beginners against mistakes if that makes it harder to use for power users. That does not mean that I will not implement functionality that will help beginners. Case in point: The warning message you get when entering 'parallel' with no input or arguments is due to a beginner being confused by nothing happening. > In this case: Shouldn't GNU parallel detect a situation like > this ("transfer to NFS homes") and exit with an error? Definitely no. I use multiple systems, some have nfs-homes and I want to be able to --transfer to those. /Ole From MAILER-DAEMON Thu Mar 15 20:33:18 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8L6c-0000Ac-MV for mharc-parallel@gnu.org; Thu, 15 Mar 2012 20:33:18 -0400 Received: from eggs.gnu.org ([208.118.235.92]:56688) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8L6a-00009Q-DA for parallel@gnu.org; Thu, 15 Mar 2012 20:33:17 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8L6X-00063F-Lc for parallel@gnu.org; Thu, 15 Mar 2012 20:33:15 -0400 Received: from mail-pz0-f41.google.com ([209.85.210.41]:60697) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8L6X-00062y-D3 for parallel@gnu.org; Thu, 15 Mar 2012 20:33:13 -0400 Received: by dadv6 with SMTP id v6so5827157dad.0 for ; Thu, 15 Mar 2012 17:33:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:from:date:x-google-sender-auth:message-id :subject:to:content-type; bh=yNz3xpMqHXaioPFDl2jn7Zx3tiv78a2uV4MO8z9mwFs=; b=e6rGSU12OY668ygWF7+EGXZSREHA1bGVBvX+rUdTCZuk8E5JljOxWxvUAlnyHC+c6c 8Hy7JUI95h64EZ11+KbH8+G6IlSOXCxOuDP6j1/OGcM1jbS1ZL8yHp45H7IQ5yDTn3yZ mgY/LaKqFEABGyNbwgJTFlzgS+MpYoWxdi7tGmUlQe5/kWKfUMrSbC9wVlq8Hipj1IUa s/qX748dwSR9K7rU0lHPVPfkMXAOWs1qI9snTqNOUBlIwJMRHjk6tIx3E8TeJ7knQktk
m5268fInoJxEmYEwTXZQjcRWiAegD4WUF3F9hKE407aW5Aj0yr71KnkEeXhY8+Otb2hF g67g== Received: by 10.68.201.37 with SMTP id jx5mr1365322pbc.75.1331857986800; Thu, 15 Mar 2012 17:33:06 -0700 (PDT) MIME-Version: 1.0 Sender: ole.tange@gmail.com Received: by 10.143.157.8 with HTTP; Thu, 15 Mar 2012 17:32:46 -0700 (PDT) From: Ole Tange Date: Fri, 16 Mar 2012 01:32:46 +0100 X-Google-Sender-Auth: 9gOvSKaQRV8WXZBDLQdykerEflw Message-ID: Subject: Slow start to cope with load To: parallel@gnu.org Content-Type: text/plain; charset=ISO-8859-1 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 209.85.210.41 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 16 Mar 2012 00:33:17 -0000 Thomas got me thinking. One of the problems with --load is that it only limits how many jobs are started. So you may start way too many. This will give you a load of 100: seq 100 | nice parallel -j0 --load 2.00 burnP6 and that is most likely not what you want. While some programs run multiple threads (and thus can give a load > 1 each) that is the exception. So in general I think we can assume one job will at most give a load of 1. Currently load is only computed every 10 seconds. So we could recompute every 10 seconds: number_of_concurrent_jobs = max_load - current_load + number_of_concurrent_jobs If the job immediately takes 100% CPU time (like burnP6) then the number of processes will grow every 10 seconds with the difference between current load and max load. As the load lags behind it may cause us to spawn too many processes that will cause a load > max load. But when the jobs finish, the load will over time drop to the max load. If the job never takes 100% CPU time (like host) then the number of processes will grow every 10 seconds with the difference between current load and max load.
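The recomputation rule proposed above (number_of_concurrent_jobs = max_load - current_load + number_of_concurrent_jobs) can be tried out with a small shell function. This is only a sketch of the arithmetic, not code from GNU Parallel: the name next_job_limit is hypothetical, truncating to a whole number is an assumption, and so is the floor of one job slot, which the proposal itself does not mention.

```shell
# Hypothetical helper: next_job_limit MAX_LOAD CURRENT_LOAD CURRENT_JOBS
# Prints the job limit for the next 10-second interval.
next_job_limit() {
  awk -v max="$1" -v cur="$2" -v jobs="$3" 'BEGIN {
    n = int(max - cur + jobs)  # the proposed rule, truncated to whole jobs
    if (n < 1) n = 1           # assumption: never drop below one job slot
    print n
  }'
}

next_job_limit 2.00 0.00 1   # idle machine: limit grows from 1 to 3
next_job_limit 2.00 5.00 2   # overloaded machine: limit falls back to 1
```

The second call shows the other side of the rule: once the measured load overshoots the maximum, the difference becomes negative and the limit shrinks again.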
If the job takes 100% CPU time after some initialization (like blast) then the number of processes will grow every 10 seconds with the difference between current load and max load. The current load will start out small, this may cause us to spawn too many processes that will cause a load > max load. If the job takes >100% CPU time after some initialization (like multithreaded blast) then the number of processes will grow every 10 seconds with the difference between current load and max load. The current load will start out small, this may cause us to spawn too many processes that will cause a load > max load. I believe it would be better than the current, but I am very open to even better ideas. /Ole From MAILER-DAEMON Fri Mar 16 07:34:22 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8VQM-0005l1-3A for mharc-parallel@gnu.org; Fri, 16 Mar 2012 07:34:22 -0400 Received: from eggs.gnu.org ([208.118.235.92]:59236) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8VQE-0005im-3q for parallel@gnu.org; Fri, 16 Mar 2012 07:34:19 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8VQ8-0000u7-0E for parallel@gnu.org; Fri, 16 Mar 2012 07:34:13 -0400 Received: from mailout.rz.uni-frankfurt.de ([141.2.22.233]:43557 helo=mailout.cluster.uni-frankfurt.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8VPy-0000sC-DD; Fri, 16 Mar 2012 07:33:58 -0400 Received: from smtpauth2-ha.cluster.uni-frankfurt.de ([10.1.1.121]) by mailout.cluster.uni-frankfurt.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8VPu-0001t2-DR; Fri, 16 Mar 2012 12:33:54 +0100 Received: from ganymed.kgu.de ([141.2.203.253] helo=[192.168.161.210]) by fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1S8VPs-0006gg-TQ; Fri, 16 Mar 2012 12:33:53 +0100 Message-ID: <4F632520.6030303@med.uni-frankfurt.de> Date: Fri, 16 Mar 2012 12:33:52 +0100 From: Thomas Sattler 
User-Agent: Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20120312 Thunderbird/11.0 MIME-Version: 1.0 To: Ole Tange Subject: Re: transfer and NFS homes References: <4F626362.4000306@med.uni-frankfurt.de> In-Reply-To: X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-MailScanner: Found to be clean X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 141.2.22.233 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 16 Mar 2012 11:34:19 -0000 > It is extremely hard to tell the difference between a power user > maxing out a server and a novice doing something that overloads the > server. I have several times used GNU Parallel causing a cpu load of > > 1000, because it was faster to complete my task that way. I'm curious about that: What kind of job works better on an overloaded system? To my knowledge reaching 100% load (not to be confused with n jobs in parallel on an n-core system) is the best that can be done. > GNU Parallel is made for power users and that is its primary goal. If > novices can use it as well, then that is fine, but GNU Parallel will > not shield beginners against mistakes if that makes it harder to use > for power users. I understand (and even advocate) that. I really hate the idea of an "rm='rm -i'" shell alias that can be seen on so many installations today. (In fact we have it here and I just don't dare to remove it, as people might have gotten used to it and might start crying if 'rm' did what it was written for.) I just hoped that there was a way to make GNU Parallel a bit more failsafe, without making it harder for power users. >> In this case: Shouldn't GNU parallel detect a situation like >> this ("transfer to NFS homes") and exit with an error? > > Definitely no.
I use multiple systems, some have nfs-homes and I > want to be able to --transfer to those. I see, so transfer to NFS is even a desired use case for you. OK. For the record: I decided to preset PARALLEL on our hosts to prevent accidental damage in the future: PARALLEL='--load 100% --nice 10 --noswap --workdir /scratch' ("--load 100%" and "--noswap" might be kind of redundant here) Thanks Thomas From MAILER-DAEMON Fri Mar 16 16:00:22 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8dK2-0005C2-Hz for mharc-parallel@gnu.org; Fri, 16 Mar 2012 16:00:22 -0400 Received: from eggs.gnu.org ([208.118.235.92]:50385) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8dJw-00059L-Py for parallel@gnu.org; Fri, 16 Mar 2012 16:00:21 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8dJv-0002dT-07 for parallel@gnu.org; Fri, 16 Mar 2012 16:00:16 -0400 Received: from mail-vb0-f41.google.com ([209.85.212.41]:41665) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8dJu-0002dJ-PT for parallel@gnu.org; Fri, 16 Mar 2012 16:00:14 -0400 Received: by vbbey12 with SMTP id ey12so255630vbb.0 for ; Fri, 16 Mar 2012 13:00:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type :content-transfer-encoding; bh=Uv89q5u7cStdMu432gw/4p8eeq44czCE0dvyivBdpcE=; b=Xh89RTCk62vrA2zvEnJes5JvFVQaRIrBcbB0MzfM0kkL7TH19YZqXAD4fUiG2vICBT LTXEsqFDalQ9rnoodq/cPUrfpWXYqDa9+hZyhnrU+TXpegeWvGr5pDRWMlU1fL3VwIbz dI0MRXBkdz21XP2Ne/ZwinKm3WoQxYTZBSnZTPss+pDKKB9LrbfPkTxQfowVf4+YL/nr alNB06BQMtsE02X+2DYZYn8orL//HptVv7o1Pf63BItwYpr6cAJo/ZhOe6a10+fToAF0 EwKZVQLeiT0XdsKNWFmMu6GUMEGMzZfnEHc2njmwDUQnt2e/eL1YUlE651YDjsSL4GN+ hyyg== MIME-Version: 1.0 Received: by 10.52.179.35 with SMTP id dd3mr2015727vdc.2.1331928012206; Fri, 16 Mar 2012 13:00:12 -0700 (PDT) Received: by 10.52.156.45 with HTTP; Fri, 16 Mar 2012 13:00:12 -0700 (PDT) Date:
Fri, 16 Mar 2012 21:00:12 +0100 Message-ID: Subject: Feature request - {..} remove two level of extensions From: =?ISO-8859-1?Q?Martin_M=F8ller_Skarbiniks_Pedersen?= To: parallel@gnu.org Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 209.85.212.41 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 16 Mar 2012 20:00:21 -0000 Hi, I have a feature request for GNU parallel. The option could be {..} or maybe {.2} and should remove the extension from a filename twice, e.g. $ ls track07.cdda.wav | parallel echo {..}.flac track07.flac I could use this when I extract files from a music disc with cdparanoia and then encode the wav files with flac. Today I do this: $ ls track*wav track07.cdda.wav track10.cdda.wav track13.cdda.wav track08.cdda.wav track11.cdda.wav track14.cdda.wav track09.cdda.wav track12.cdda.wav track15.cdda.wav $ ls *wav | parallel -n 1 flac {} -o {.}.flac [...] $ ls track*flac track07.cdda.flac track10.cdda.flac track13.cdda.flac track08.cdda.flac track11.cdda.flac track14.cdda.flac track09.cdda.flac track12.cdda.flac track15.cdda.flac $ ls *flac | parallel -n 1 mv {} {.} $ ls *cdda | parallel -n 1 mv {} {.}.flac Regards Martin -- To unauthorized persons reading along: There is no reason to read my mail. I have nothing to do with FARC, al-Jihad, al-Qaida, Hamas, Hizb al-Mujahidin or ETA. I have never given Zakat, do not support Istishad, have not made a car bomb or nuclear weapons, and I barely know what Al Manar and бомба mean. But thanks for the interest shown. Long live levelling! Strengthen mob rule! Keep the envy taxes and the café money! Hooray for the burden of the elderly!
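Until something like the requested {..} exists, the "remove the extension twice" step can be done by the shell itself. The following is a minimal sketch using POSIX parameter expansion; the function name strip_two_ext is made up for illustration and is not a GNU Parallel feature.

```shell
# Hypothetical helper: strips up to two trailing ".ext" suffixes from a name.
# Note: in plain POSIX sh the variable "name" is global to the script.
strip_two_ext() {
  name=${1%.*}      # track07.cdda.wav -> track07.cdda
  name=${name%.*}   # track07.cdda     -> track07
  printf '%s\n' "$name"
}

strip_two_ext track07.cdda.wav   # prints: track07
```

A name with only one extension loses just that one, because the second expansion finds no further dot to remove. The result could be used to build the target name directly (for example "$(strip_two_ext "$f").flac"), which would avoid the two extra mv passes shown in the request.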
From MAILER-DAEMON Fri Mar 16 18:49:21 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S8fxZ-0004dA-FK for mharc-parallel@gnu.org; Fri, 16 Mar 2012 18:49:21 -0400 Received: from eggs.gnu.org ([208.118.235.92]:45477) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8fxW-0004cm-Tx for parallel@gnu.org; Fri, 16 Mar 2012 18:49:20 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S8fxV-0001zM-1g for parallel@gnu.org; Fri, 16 Mar 2012 18:49:18 -0400 Received: from mail-pz0-f41.google.com ([209.85.210.41]:61109) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S8frH-00015J-RJ for parallel@gnu.org; Fri, 16 Mar 2012 18:42:52 -0400 Received: by dadv6 with SMTP id v6so7356151dad.0 for ; Fri, 16 Mar 2012 15:42:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type; bh=SNqR9rrZrR+L3ffQpNzpLggWW4y/5pMNTKOUQIG2fHY=; b=IPzsneK+Tx23S27dE2O2YH0dYhju39N8Q+/411DVwDFwyz2Ad3AHCnM/y7XWLDoxSP BV4g/QlZSnoVcFIhQdQjefQYjUoBVvzzlJ6CxYCesHpwXu1bIHX63o/iPtdl+dnxzdqc b/wHBbMcBf8Fn1FNKEelgVZH+HZpOxqLfv3pqACfry6s4JZWE1tan3mAyIM/sozBeLq6 7XL7qVPR22RVa2qtJzM1ihn2ez/NtmkdjbPIrAoJZH/QUBjgD00qxPxtcUbFPIR7O6iU gC4UicDH3cZCuS5pmhiABBv4ZG+tKuAtKvKHndOeGMJaFVIx8hSRDluacXbdWyJtekXS I5Ag== Received: by 10.68.216.98 with SMTP id op2mr18744644pbc.93.1331937768826; Fri, 16 Mar 2012 15:42:48 -0700 (PDT) MIME-Version: 1.0 Sender: ole.tange@gmail.com Received: by 10.142.169.8 with HTTP; Fri, 16 Mar 2012 15:42:28 -0700 (PDT) In-Reply-To: <4F632520.6030303@med.uni-frankfurt.de> References: <4F626362.4000306@med.uni-frankfurt.de> <4F632520.6030303@med.uni-frankfurt.de> From: Ole Tange Date: Fri, 16 Mar 2012 23:42:28 +0100 X-Google-Sender-Auth: fbf4xlOoH06cYWEiOu0qX_BaEQc Message-ID: Subject: Re: transfer and NFS homes To: Thomas Sattler Content-Type: text/plain; 
charset=ISO-8859-1 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 209.85.210.41 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 16 Mar 2012 22:49:20 -0000

On Fri, Mar 16, 2012 at 12:33 PM, Thomas Sattler wrote:

>> It is extremely hard to tell the difference between a power user maxing out a server and a novice doing something that overloads the server. I have several times used GNU Parallel causing a cpu load of > 1000, because it was faster to complete my task that way.
>
> I'm curious about that: What kind of job is working better on an overloaded system? To my knowledge reaching 100% load (not to be mixed with n jobs in parallel on an n-core system) is the best to be done.

"Best" depends on how you measure. In my case I used -j0 because the total time (writing the command + executing it) would be minimized that way. If I had optimized for execution time I could probably have found a smaller number of jobs that would have executed marginally faster, but it would have taken time to find this number, and thus the total time spent would have been longer.

Each job did a mix of disk I/O, network I/O and CPU in a hard-to-predict manner.

/Ole From MAILER-DAEMON Mon Mar 19 06:26:51 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S9Znf-0006Oj-EV for mharc-parallel@gnu.org; Mon, 19 Mar 2012 06:26:51 -0400 Received: from eggs.gnu.org ([208.118.235.92]:40948) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S9ZnC-0006It-FO for parallel@gnu.org; Mon, 19 Mar 2012 06:26:50 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S9Zn5-0004MH-PJ for parallel@gnu.org; Mon, 19 Mar 2012 06:26:22 -0400 Received: from mail-pz0-f41.google.com ([209.85.210.41]:56991) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S9Zn5-0004M4-De for parallel@gnu.org; Mon, 19 Mar 2012 06:26:15 -0400 Received: by dadv6 with SMTP id v6so10390953dad.0 for ; Mon, 19 Mar 2012 03:26:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type :content-transfer-encoding; bh=I818EgZfRYbkpg/BLgqCg1BY3IpniHfryGqocQiivok=; b=dnwQ6WfjACPO0W/ZECo5Kvx6eU7LDHKKtEsNVi06VcEt7KuPEnLplKSM8C9CSnf+M2 0LVxfvZJlBV6T+tpdTe44N5+0istV0kZlOSA4i76q9w5ouZpHMRZopVgXzP6P6xVij8l MkpAmKspPPejYrTp/2Hiak1Ty9WyBtiu96Jghnm1uMBnNLKQgo87hOhrIl3BH9i94xSn luNFn/V9i8rGPCIrzlLgQPsQQQMb//CA62Pfjmy/+0NepuwDjrK+1I9XnQm7+yhQRL+D 5mtKCo0e6uzX8azgev0VT629YBcEWSvyuoB+pCrdN7RsI1dpHL69nCtw0u0SI815YrpD emVw== Received: by 10.68.74.97 with SMTP id s1mr38613933pbv.46.1332152772413; Mon, 19 Mar 2012 03:26:12 -0700 (PDT) MIME-Version: 1.0 Sender: ole.tange@gmail.com Received: by 10.142.169.8 with HTTP; Mon, 19 Mar 2012 03:25:52 -0700 (PDT) In-Reply-To: References: From: Ole Tange Date: Mon, 19 Mar 2012 11:25:52 +0100 X-Google-Sender-Auth: nsObMNCWFFAQIAMshHpU5N37FKI Message-ID: Subject: Re: Slow start to cope with load To: "Matt Oates (Home)" Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable 
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 209.85.210.41 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Mar 2012 10:26:50 -0000

On Mon, Mar 19, 2012 at 10:20 AM, Matt Oates (Home) wrote:
>
> On 16 March 2012 00:32, Ole Tange wrote:
> > One of the problems with --load is that it only limits how many jobs are started. So you may start way too many. This will give you a load of 100:
> >
> >   seq 100 | nice parallel -j0 --load 2.00 burnP6
> >
> > and that is most likely not what you want.
>
> Am I wrong in thinking you can just do -j 100% so that you never spawn more than maxload processes assuming one process load 1.0 on a single core? Can you not use -j 100% in conjunction with --load to prevent the overload on startup?

For CPU hungry programs like 'burnP6' that would be true. But if the program only uses 10% CPU (because it is waiting for network or disk I/O), then we should be able to spawn more - preferably automatically figuring out the "right" amount.

> > While some programs run multiple threads (and thus can give a load > 1 each) that is the exception. So in general I think we can assume one job will at most give a load of 1.
>
> It would be nice to explicitly state the likely load per process though especially if you are the one setting it. I frequently run hmm building with concurrent threading per process and just do the maths myself, and am lucky that all the hosts have the same number of CPUs. Perhaps a flag like --is-threaded=4 or something to indicate the likely load per job?

I am not too happy about that. I would much prefer some automated way of doing-the-right-thing.

> > Currently load is only computed every 10 seconds. So we could recompute every 10 seconds:
> >
> >   number_of_concurrent_jobs = max_load - current_load + number_of_concurrent_jobs
>
> Looks good, though I have a couple of questions: If this is negative are you going to kill processes rather than start them? What if it's always 0 even from the start are you just never going to run on this host?

As a user I would be very surprised if GNU Parallel started to kill my jobs, and I try to design GNU Parallel adhering to POLA:
http://en.wikipedia.org/wiki/Principle_of_least_astonishment

So if it is < 1 it would mean: Do not spawn more new jobs, but wait for jobs to complete.

> > I believe it would be better than the current, but I am very open to even better ideas.
>
> You are starting to get into the realm of needing to understand scheduling per host... Load might be reported for something with a different nice value than what you want to submit. So 100% load for something with <0 nice and you want to put something in for +19. In your equation above I would just add in something looking at the difference between parallel's jobs that are running and those that are ready/waiting. If all our jobs are running even under high load who cares, we have priority here so keep up with the max load. If half of our jobs are waiting then we might as well reduce spawning by half.

I did not understand this part.

> Best,
> Matt.

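
The recomputation rule discussed in this message can be sketched as a small shell function. This is only an illustration of the quoted equation, not GNU Parallel's actual code; the name `next_job_limit` and the clamping to 0 are assumptions chosen to reflect the "never kill jobs, just wait" answer above.

```sh
# Hypothetical helper for the rule quoted above:
#   number_of_concurrent_jobs = max_load - current_load + number_of_concurrent_jobs
# Usage: next_job_limit MAX_LOAD CURRENT_LOAD RUNNING_JOBS
# Prints the new job limit as an integer. A value below the number of running
# jobs means "start nothing new and wait"; jobs are never killed.
next_job_limit() (
    max_load=$1
    current_load=$2
    running_jobs=$3
    awk -v m="$max_load" -v c="$current_load" -v r="$running_jobs" 'BEGIN {
        limit = int(m - c + r)     # load averages are fractional, so use awk
        if (limit < 0) limit = 0   # assumption: a negative result is treated as 0
        print limit
    }'
)

# The burnP6 case from above: 100 jobs running, load 100, --load 2.00.
next_job_limit 2.00 100 100
```

In that example the limit drops to 2, so no new job would be started until fewer than 2 of the 100 are still running.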
/Ole From MAILER-DAEMON Mon Mar 19 07:28:40 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S9alU-0001XA-O1 for mharc-parallel@gnu.org; Mon, 19 Mar 2012 07:28:40 -0400 Received: from eggs.gnu.org ([208.118.235.92]:53685) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S9alO-0001WJ-BR for parallel@gnu.org; Mon, 19 Mar 2012 07:28:39 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S9akz-0006ia-FZ for parallel@gnu.org; Mon, 19 Mar 2012 07:28:33 -0400 Received: from mail-yx0-f169.google.com ([209.85.213.169]:36823) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S9akz-0006iS-6z; Mon, 19 Mar 2012 07:28:09 -0400 Received: by yenm8 with SMTP id m8so6217144yen.0 for ; Mon, 19 Mar 2012 04:28:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type:content-transfer-encoding; bh=4IjTxO1qgU8x0mFHVoRISdzsded4Lj3m41oFKHryaNs=; b=US2zHWUvqq25ji27W6wRdKoKjDguTm4X4eUxiV6nmK1vgxDYSab54nnR0tPpFytY4k YpCnkcnPExXK/ayZmxJqyPpjBDnw6aA308uurP9NsX+z5zE0m2L8BkYDqfckGlRkp3ZW 9u5/HMb0Cn+zDsa/bnG+mJnKYByfKNfCnK3QfT0lX5NKZovoQOZ6gIWUWhbJmnmLnS0r cb4+k+9P+bgALQ5NyunaO2mXWU3mpm1clTrJ9Q57hxBkIgTdWMvE5O6Y2jIz9cWJyLc5 PVxkLTz+slCedNow2fgpMywlkL9goxZ5axhQw67A+3Moqh7NPFVkTgHBiQXZ0mnXrXO/ kp5Q== Received: by 10.236.192.138 with SMTP id i10mr12400090yhn.3.1332156486778; Mon, 19 Mar 2012 04:28:06 -0700 (PDT) MIME-Version: 1.0 Received: by 10.100.106.7 with HTTP; Mon, 19 Mar 2012 04:27:46 -0700 (PDT) In-Reply-To: References: From: "Matt Oates (Home)" Date: Mon, 19 Mar 2012 11:27:46 +0000 Message-ID: Subject: Re: Slow start to cope with load To: Ole Tange Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 209.85.213.169 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Mar 2012 11:28:39 -0000

On 19 March 2012 10:25, Ole Tange wrote:
> On Mon, Mar 19, 2012 at 10:20 AM, Matt Oates (Home) wrote:
>> Am I wrong in thinking you can just do -j 100% so that you never spawn more than maxload processes assuming one process load 1.0 on a single core? Can you not use -j 100% in conjunction with --load to prevent the overload on startup?
>
> For CPU hungry programs like 'burnP6' that would be true. But if the program only uses 10% CPU (because it is waiting for network or disk I/O), then we should be able to spawn more - preferably automatically figuring out the "right" amount.

If it is low because of blocking, spawning more jobs isn't going to help the wait on IO.

>> > While some programs run multiple threads (and thus can give a load > 1 each) that is the exception. So in general I think we can assume one job will at most give a load of 1.
>>
>> It would be nice to explicitly state the likely load per process though especially if you are the one setting it. I frequently run hmm building with concurrent threading per process and just do the maths myself, and am lucky that all the hosts have the same number of CPUs. Perhaps a flag like --is-threaded=4 or something to indicate the likely load per job?
>
> I am not too happy about that. I would much prefer some automated way of doing-the-right-thing.

If I'm already setting this manually, though, why do the right thing automatically when I know what the right thing to do is? I agree having parallel throttle automatically as normal is best. But it would be nice to explicitly state what you know if you are already specifying it in the job.

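
The per-host arithmetic that has to be done by hand for threaded jobs might look like the sketch below. The function `slots_for_host` is hypothetical, and `--is-threaded` is only a flag proposed in this thread; nothing here is an existing GNU Parallel option.

```sh
# Hypothetical helper: how many jobs of THREADS threads each fit into
# PERCENT of a host with CORES cores, rounded down.
# Usage: slots_for_host CORES PERCENT THREADS
# Assumption: all three arguments are positive integers.
slots_for_host() (
    cores=$1
    percent=$2
    threads=$3
    echo $(( cores * percent / 100 / threads ))
)

slots_for_host 8 100 4   # 8-core host, 4 threads per job
slots_for_host 6 100 4   # 6 cores do not divide evenly by 4
```

On hosts whose core counts differ, the result differs per host, which is why a fixed job count only works when all hosts have the same number of CPUs.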
>> Looks good, though I have a couple of questions: If this is negative are you going to kill processes rather than start them? What if it's always 0 even from the start are you just never going to run on this host?
>
> As a user I would be very surprised if GNU Parallel started to kill my jobs, and I try to design GNU Parallel adhering to POLA:
> http://en.wikipedia.org/wiki/Principle_of_least_astonishment
>
> So if it is < 1 it would mean: Do not spawn more new jobs, but wait for jobs to complete.

Great, that's what I wanted to hear :) I already have problems with the kernel process killer hitting my jobs when someone else submits a big job; it would be really lame if my job killed itself too.

>> > I believe it would be better than the current, but I am very open to even better ideas.
>>
>> You are starting to get into the realm of needing to understand scheduling per host... Load might be reported for something with a different nice value than what you want to submit. So 100% load for something with <0 nice and you want to put something in for +19. In your equation above I would just add in something looking at the difference between parallel's jobs that are running and those that are ready/waiting. If all our jobs are running even under high load who cares, we have priority here so keep up with the max load. If half of our jobs are waiting then we might as well reduce spawning by half.
>
> I did not understand this part.

Two points:

1.) You can have high but very low priority load. In this case we want a high priority job to ignore the load because it can replace it completely. For example updatedb is usually low nice value, when we come along with our job it doesn't matter if there is high load since we will knock updatedb off of the scheduling queue.
2.) You can take into account priority by just including what percentage of our jobs are in the "running" process state rather than "ready" or "waiting" state. So if there is high load and we put in 100 processes and all of them are running, it's fine... if only 1 is running and the rest are just waiting then we should alter appropriately to that ratio until you find a natural size on the host machine.

Hope that's a bit more clear? It just means adjusting your equation to something like:

number_of_concurrent_jobs = max_load - current_load + (number_of_concurrent_jobs - number_of_concurrent_jobs_in_wait_state / 2)

That way you quickly converge on the number of processes that can run. I'd ignore those that are blocked on IO, just negate the ones that are literally waiting on CPU.

Best,
Matt.

From MAILER-DAEMON Mon Mar 19 13:30:42 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S9gPq-0006Gp-E3 for mharc-parallel@gnu.org; Mon, 19 Mar 2012 13:30:42 -0400 Received: from eggs.gnu.org ([208.118.235.92]:47675) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S9gPi-0006FR-So for parallel@gnu.org; Mon, 19 Mar 2012 13:30:40 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S9gPb-0003v5-0A for parallel@gnu.org; Mon, 19 Mar 2012 13:30:34 -0400 Received: from mail-pz0-f41.google.com ([209.85.210.41]:40720) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S9gPa-0003tp-J3 for parallel@gnu.org; Mon, 19 Mar 2012 13:30:26 -0400 Received: by dadv6 with SMTP id v6so10913624dad.0 for ; Mon, 19 Mar 2012 10:30:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type :content-transfer-encoding; bh=acSibNe6x2KGIg0h8CiCKU3A2/jPG0FraJHGhItuzSw=; b=Sn43lgSyP0ufIHV/GwGkMzcWOy/LMTx2PhQ3dXHVuxxN9mEfRIozUCAid7DyidgNc9 uQeowyhV7bO9MnTaVoeXVoa1Udr3lcDitMwthFNw8S0IAHOATltKwfIeqnnK6JJ/6MF+ StK4vBqsUNLbyDteqvGRVqXPZbFApfuw1zu2Is20r5Rco6COaki1accI70yu93CZzpnw
ZjsNIPuCEc2ukh2M1SeKtQu1TqA8Fn540D6m+GYbAvN01K74sOYaRVKYfXMdXHc28s5D d4fdivsk708aQfts+Y8z4EBnKfBR8eCvpJhCym9Y0eqQX8rgsT7tl1DElTnGeEwUdwr1 yWlQ== Received: by 10.68.216.98 with SMTP id op2mr42499513pbc.93.1332178224521; Mon, 19 Mar 2012 10:30:24 -0700 (PDT) MIME-Version: 1.0 Sender: ole.tange@gmail.com Received: by 10.142.200.11 with HTTP; Mon, 19 Mar 2012 10:30:04 -0700 (PDT) In-Reply-To: References: From: Ole Tange Date: Mon, 19 Mar 2012 18:30:04 +0100 X-Google-Sender-Auth: 1j95hlaL0oyWnJ7tMSXERsExQF8 Message-ID: Subject: Re: Slow start to cope with load To: "Matt Oates (Home)" Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 209.85.210.41 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Mar 2012 17:30:41 -0000

On Mon, Mar 19, 2012 at 12:27 PM, Matt Oates (Home) wrote:
> On 19 March 2012 10:25, Ole Tange wrote:
>> On Mon, Mar 19, 2012 at 10:20 AM, Matt Oates (Home) wrote:
>>> Am I wrong in thinking you can just do -j 100% so that you never spawn more than maxload processes assuming one process load 1.0 on a single core? Can you not use -j 100% in conjunction with --load to prevent the overload on startup?
>>
>> For CPU hungry programs like 'burnP6' that would be true. But if the program only uses 10% CPU (because it is waiting for network or disk I/O), then we should be able to spawn more - preferably automatically figuring out the "right" amount.
>
> If it is low because of blocking spawning more jobs isn't going to help the wait on IO.

If the I/O you are waiting for is a reply from a server (which could be caused by latency) then it often makes sense to spawn more than one per CPU.

>>> Perhaps a flag like --is-threaded=4 or something to indicate the likely load per job?
>>
>> I am not too happy about that. I would much prefer some automated way of doing-the-right-thing.
>
> If I'm already setting this manually though why do the right thing automatically when I know what the right thing to do is. I agree having parallel throttle automatically as normal is best. But it would be nice to explicitly state what you know if you are already specifying it in the job.

The --is-threaded will only make sense for CPU limited jobs. So explain in which situations these would not be equivalent:

  -j 100% --is-threaded=4
  -j 25%

>>> You are starting to get into the realm of needing to understand scheduling per host... Load might be reported for something with a different nice value than what you want to submit. So 100% load for something with <0 nice and you want to put something in for +19. In your equation above I would just add in something looking at the difference between parallel's jobs that are running and those that are ready/waiting. If all our jobs are running even under high load who cares, we have priority here so keep up with the max load. If half of our jobs are waiting then we might as well reduce spawning by half.
>>
>> I did not understand this part.
>
> Two points:
> 1.) You can have high but very low priority load. In this case we want a high priority job to ignore the load because it can replace it completely. For example updatedb is usually low nice value, when we come along with our job it doesn't matter if there is high load since we will knock updatedb off of the scheduling queue.
> 2.) You can take into account priority by just including what percentage of our jobs are in the "running" process state rather than "ready" or "waiting" state. So if there is high load and we put in 100 processes and all of them are running, it's fine... if only 1 is running and the rest are just waiting then we should alter appropriately to that ratio until you find a natural size on the host machine.
>
> Hope that's a bit more clear? It just means adjusting your equation to something like:
>
> number_of_concurrent_jobs = max_load - current_load + (number_of_concurrent_jobs - number_of_concurrent_jobs_in_wait_state / 2)
>
> That way you quickly converge on the number of processes that can run, I'd ignore those that are blocked on IO, just negate the ones that are literally waiting on CPU.

If I understand you correctly you basically want to ignore the load average as reported by the server, but instead compute your own, where you ignore the jobs that are nicer than you are.

If that is what you mean I see the following problems:

* It is hard to explain what is going on (thus not adhering to Principle of Least Astonishment).
* How do you determine what processes will be knocked off the scheduling queue?
* How do you tell whether the job you are running is limited by disk I/O or CPU?
* How do you tell if the running process is a (detached) (grand*)child of a process started by GNU Parallel and that the parent is just waiting for the child to complete?

It seems like an awful lot of complexity, but I might be wrong.

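
For comparison, the adjusted equation quoted above can be written out as follows. This is a sketch of the proposal as literally written (only the waiting count is halved), under the assumption that the number of jobs waiting on CPU is already known; obtaining that number reliably is exactly the difficulty raised in the list of problems. The name `adjusted_job_limit` is hypothetical.

```sh
# Hypothetical helper for the proposed variant:
#   limit = max_load - current_load + (running_jobs - waiting_jobs / 2)
# Usage: adjusted_job_limit MAX_LOAD CURRENT_LOAD RUNNING_JOBS WAITING_JOBS
adjusted_job_limit() (
    awk -v m="$1" -v c="$2" -v r="$3" -v w="$4" 'BEGIN {
        limit = int(m - c + (r - w / 2))
        if (limit < 0) limit = 0   # assumption: never go below zero
        print limit
    }'
)

adjusted_job_limit 8 6 10 0   # all 10 jobs are on CPU
adjusted_job_limit 8 6 10 4   # 4 of the 10 jobs are waiting for CPU
```

With no jobs waiting, the result equals the original equation; every two waiting jobs lower the limit by one.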
/Ole From MAILER-DAEMON Mon Mar 19 14:32:47 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S9hNv-00052W-Ly for mharc-parallel@gnu.org; Mon, 19 Mar 2012 14:32:47 -0400 Received: from eggs.gnu.org ([208.118.235.92]:36753) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S9hNa-0004tA-1a for Parallel@gnu.org; Mon, 19 Mar 2012 14:32:45 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S9hNY-0006iO-3r for Parallel@gnu.org; Mon, 19 Mar 2012 14:32:25 -0400 Received: from imr-da06.mx.aol.com ([205.188.169.203]:48650) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S9hNX-0006i0-Sw for Parallel@gnu.org; Mon, 19 Mar 2012 14:32:24 -0400 Received: from mtaomg-da05.r1000.mx.aol.com (mtaomg-da05.r1000.mx.aol.com [172.29.51.141]) by imr-da06.mx.aol.com (8.14.1/8.14.1) with ESMTP id q2JIW5t8003902 for ; Mon, 19 Mar 2012 14:32:05 -0400 Received: from core-drd005a.r1000.mail.aol.com (core-drd005.r1000.mail.aol.com [172.29.227.17]) by mtaomg-da05.r1000.mx.aol.com (OMAG/Core Interface) with ESMTP id D949CE00008E for ; Mon, 19 Mar 2012 14:32:04 -0400 (EDT) To: Parallel@gnu.org Subject: Re: Slow start to cope with load X-MB-Message-Source: WebUI X-MB-Message-Type: User MIME-Version: 1.0 From: David Content-Type: multipart/alternative; boundary="--------MB_8CED415B77B338D_1F1C_426ED_webmail-d071.sysops.aol.com" X-Mailer: AOL Webmail 35775-STANDARD Received: from 70.20.200.186 by webmail-d071.sysops.aol.com (205.188.167.105) with HTTP (WebMailUI); Mon, 19 Mar 2012 14:32:04 -0400 Message-Id: <8CED415B76CEB47-1F1C-1167E@webmail-d071.sysops.aol.com> X-Originating-IP: [70.20.200.186] Date: Mon, 19 Mar 2012 14:32:04 -0400 (EDT) x-aol-global-disposition: G DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=mx.aol.com; s=20110426; t=1332181925; bh=2g4bVUAMo4KxKWwa+wf3cyE+4mw/fmLLdEaXJisc8Ho=; h=From:To:Subject:Message-Id:Date:MIME-Version:Content-Type; 
b=XaN4uKJLc1HywVikGhENo33zDe3Vw6x7yvTW1drI/0ZbcWDMysndakdYHfujDc4bf u9tqo4bdFbAwwB/G7nI67tyRX6s2yDGtBJFsAaGbBGfol9gBjoINmifoCyCdSVm/Cz FW/Mq0+NLb2Jx2h0rGV/r/8GqJWoe8z05sXZJpew= X-AOL-SCOLL-SCORE: 0:2:397325280:93952408 X-AOL-SCOLL-URL_COUNT: 0 x-aol-sid: 3039ac1d338d4f677ba44db2 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 205.188.169.203 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Mar 2012 18:32:46 -0000

This is a multi-part message in MIME format.
----------MB_8CED415B77B338D_1F1C_426ED_webmail-d071.sysops.aol.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="us-ascii"

The human way to do this might be a good model:

  1. Start N, one for each core, for example, 8, nominally hoping for 100% or less busy. If the system is already 25% busy, it might be nice to try to leave that intact, for example starting with 6.
  2. After a moment, review total idle time, and for say 30% idle, try 30 * N / 70 more, for 8, 3 to make nominally 96.25% busy. Or get greedy, with 4 for 105% busy.
  3. After another moment, if the target is not approximately reached, cap the parallelism at the amount the idle time indicates, maybe plus 1 (rounded up), for instance, for 20% idle, 8 * 80 / 70 = 10. If there is essentially no increase, it might be good to lower the cap, for instance below 8. If the target is reached, these steps can be repeated as processing progresses, so if CPU use drops, parallelism can be increased, and if it increases, no more are spawned until the running count drops.

Thus, 100% CPU, if usable, is utilized on the nose, middle or tail of the operation, wherever the lower need. At times when CPU need expands to exceed 100%, use of other resources may drop, but that is all the CPU you bought. If the other resources cap the parallelism, there is no point in more, and more reduces stability and increases any rerun time if there is an interruption, because fewer reach completion before.

For instance, on network TCP transfers, often process 2 adds 5-10% and process 3 adds nothing, but 3 might be a nice level of parallelism, so you drop to 2 when a process terminates or a packet is lost and never lose that 10%, unless 2 or 3 end simultaneously. The price of process 2 is that process 1 drops from 90% to 50% speed, and process 3 takes them all to 33%, so choosing between 2 for faster unit turnaround and 3 for better total bandwidth use during job end/start or packet loss is a matter of taste and situation.

If reliability is never an issue (interruptions like network loss), overloading some resource on a host with plenty of RAM does not hurt final run time. There may be some loss when going past the number of cores, even if there is idle time, if cache hits are reduced by forcing more process changes on each core. Added cache latency can turn into critical process latency, if progress is somehow tied to event turnaround time, like a transfer with insufficient buffering.

-- David
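
The arithmetic in the three steps above can be sketched with integer shell arithmetic. The function names are hypothetical, and reading the cap in step 3 as "jobs times busy-now divided by busy-before, rounded up" is an assumption made to match the 8 * 80 / 70 = 10 example; the message itself does not spell out the formula.

```sh
# Step 1. initial_jobs CORES BUSY_PERCENT
# Start one job per core, scaled down to leave the existing load intact.
initial_jobs() (
    echo $(( $1 * (100 - $2) / 100 ))
)

# Step 2. extra_jobs JOBS IDLE_PERCENT
# Jobs to add so the idle share gets used: idle * jobs / (100 - idle),
# rounded down. Assumption: prints 0 when no estimate is possible
# (idle of 100% or more).
extra_jobs() (
    if [ "$2" -ge 100 ]; then
        echo 0
    else
        echo $(( $2 * $1 / (100 - $2) ))
    fi
)

# Step 3. job_cap JOBS BUSY_NOW_PERCENT BUSY_BEFORE_PERCENT
# Cap suggested by the observed change in busy time, rounded up.
job_cap() (
    echo $(( ($1 * $2 + $3 - 1) / $3 ))
)

initial_jobs 8 25    # 8 cores, 25% already busy
extra_jobs 8 30      # 8 jobs running, 30% idle
job_cap 8 80 70      # 8 jobs, busy went from 70% to 80%
```

The three calls reproduce the figures used in the message: start with 6, add 3, cap at 10.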

----------MB_8CED415B77B338D_1F1C_426ED_webmail-d071.sysops.aol.com-- From MAILER-DAEMON Tue Mar 20 05:50:29 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1S9vi1-0002fG-57 for mharc-parallel@gnu.org; Tue, 20 Mar 2012 05:50:29 -0400 Received: from eggs.gnu.org ([208.118.235.92]:39851) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S9vhs-0002C9-9r for parallel@gnu.org; Tue, 20 Mar 2012 05:50:26 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1S9vhk-0004Hf-Rq for parallel@gnu.org; Tue, 20 Mar 2012 05:50:19 -0400 Received: from mail-gx0-f169.google.com ([209.85.161.169]:34938) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1S9vhk-0004Gu-LR; Tue, 20 Mar 2012 05:50:12 -0400 Received: by ggeq1 with SMTP id q1so7171935gge.0 for ; Tue, 20 Mar 2012 02:50:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type; bh=Bwv90mQZzXoRf7t2puLCUMpe9NO/fCtTa6D+eL+k9Pw=; b=tKgflApQWTlkHGeOx88gyV0OdrjCKmKSVQykgwZsZZsIb6fRZ4iiYWsx3hpRJmcgVt LO7lnRP3IhhH7F90gJdU9JtcCPmK+akBHPf+gGKMZnhZ92h5fURwPLS93WaoGlLowc0d PnZqhQBQE2NTnOTlPpGPfByAAUI5btXHW8ZdNUcQJxSF8ze895ISBbnXdC61RIvaoCUn mVkn0JbCseoWV0CPk7vfFav7LaHdJzaQ0O3xo0fXSQlbxoUF+CGyzk18cRzd5T9Z7k35 IbX+pCZGxQ5Dc8bJJm9tPOBs04a+N3YL9iRgilz7GmaOANywqWnlSf0xlECVODIuovhN 5B/A== Received: by 10.100.227.1 with SMTP id z1mr4815841ang.86.1332237008724; Tue, 20 Mar 2012 02:50:08 -0700 (PDT) MIME-Version: 1.0 Received: by 10.100.106.7 with HTTP; Tue, 20 Mar 2012 02:49:48 -0700 (PDT) In-Reply-To: References: From: "Matt Oates (Home)" Date: Tue, 20 Mar 2012 09:49:48 +0000 Message-ID: Subject: Re: Slow start to cope with load To: Ole Tange Content-Type: text/plain; charset=UTF-8 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 209.85.161.169 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Mar 2012 09:50:27 -0000

> The --is-threaded will only make sense for CPU limited jobs.

I agree, and these are usually the jobs where the program developer has already added multithreading.

> So explain in which situations these would not be equivalent:
>
> -j 100% --is-threaded=4
> -j 25%

The difference is: what if I don't know how many CPUs there are on each machine with -S, and the set is heterogeneous and not evenly divisible by four? I'd like to specify any given percentage as a specificity of total CPU use, and then hint to parallel that my job is going to use 4 cores if you schedule a single one. For example, 25% of a 6 core machine isn't enough to hold a single 4 core job without going over the 25% allocation I specified. I'm not suggesting that this is a worthwhile feature, just probably an easier one to implement that has valid use.

> If I understand you correctly you basically want to ignore the load average as reported by the server, but instead compute your own, where you ignore the jobs that are nicer than you are.

Not at all, I just think it makes more sense to take into account the ratio of parallel submitted jobs that are in the run/block state to the ready/waiting state. What is the point in issuing more jobs that are CPU bound and waiting? It's adding load with no reward! If the opposite is true, why not issue more jobs even when starting at high load? I would use this as a weighting for your current equation, not as the method of planning how many jobs to issue.

> If that is what you mean I see the following problems:
>
> * It is hard to explain what is going on (thus not adhering to Principle of Least Astonishment).
> * How do you determine what processes will be knocked off the scheduling queue?

You don't, you just know it's happening if your running/ready ratio is good at high load. This is something that's not hard for parallel to work out, especially for child processes.

> * How do you tell whether the job you are running is limited by disk I/O or CPU?

If it's in the running state it's not IO limited instantaneously, so who cares. The whole job will be IO limited, which is more important for balancing load, if the ratio of running+waiting to blocked processes is small: with (1 running + 3 waiting) / 100 blocked it's going to be IO limited. That's kind of the whole point of the kernel telling you process states.

> * How do you tell if the running process is a (detached) (grand*)child of a process started by GNU Parallel and that the parent is just waiting for the child to complete?

If by detached you mean daemonized with its parent pid as 1? AFAIK you wouldn't ever see something waiting on a daemon unless it's done badly. Also, wouldn't that utterly break parallel anyway, as there is no way to get the stdout back since the processes parallel had a pipe to will have exited, if daemonizing was done properly? If you mean forked rather than detached, then walking the process tree and taking an aggregate of all the leaf processes per job is the way to go.

> It seems like an awful lot of complexity, but I might be wrong.

I agree completely and was pointing out the levels of complexity you would need to go to in order to cause the least surprise, given what people do to load balance. My point is that an oversimplification of the actual problem of load balancing is even more dangerous if people rely on it to do something smart. Already you are causing surprise by farming out 100 jobs if the load is starting out nearly maxed out. To do something that's magical you have to create the magic.

I would if anything remove the load feature before making it more complex, or just write in the documentation the limitations of its use, the cases where it's very useful and the others where it's pathological. IMHO by adding shallow support for a batch queueing use, people are going to just be increasingly annoyed when they shoot themselves in the foot, as Thomas did.

Best,
Matt.

From MAILER-DAEMON Thu Mar 22 08:46:47 2012
Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SAhPj-0006RS-2S for mharc-parallel@gnu.org; Thu, 22 Mar 2012 08:46:47 -0400
Received: from eggs.gnu.org ([208.118.235.92]:50962) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SAhPg-0006Q6-Er for parallel@gnu.org; Thu, 22 Mar 2012 08:46:45 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SAhPb-0008Br-2o for parallel@gnu.org; Thu, 22 Mar 2012 08:46:44 -0400
Received: from mail-gx0-f169.google.com ([209.85.161.169]:37025) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SAhPO-0008AE-Ga; Thu, 22 Mar 2012 08:46:26 -0400
Received: by ggeq1 with SMTP id q1so1976131gge.0 for ; Thu, 22 Mar 2012 05:46:23 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type; bh=+ziglgHMl4d2hGd6ujMH5DXQ/4wwbWfMJMqNHNTDkRk=; b=CdhTwBG7B1teIO63G+J2ffZk9EnPIEmk/nyFURHK9Q2rPotnulTw7jsf3kC59aoYVu 5BHhXMxUnPm3wl4Y6cm8LE//Qjt8g5Kf+HnXXT3CdXRzewYdkIkS8y7wGPYxHyjR+iYV pNEFjcjtMQawX+TnYYvccnflQxbNaUtWmme1bNMnOTvFgTnMZe0NW1pIChU6KNpsUIfV lWMh16yTw1XC5R2IdH9trJiMB7kZwhs4JUi9v1m2S+EUwZ3P/dUE5qqzZ6LkJQYycQtD RlgHnHz02XOyKUt3LTdsFZfW+xQCNkhv1bv7GwWn2Quyt7rEMEBfTUSAB0eOmh+qe96e v0CQ==
Received: by 10.68.230.195 with SMTP id ta3mr20308046pbc.149.1332420382619; Thu, 22 Mar 2012 05:46:22 -0700 (PDT)
MIME-Version: 1.0
Sender: ole.tange@gmail.com
Received: by 10.142.200.11 with HTTP; Thu, 22 Mar 2012 05:46:02
-0700 (PDT)
In-Reply-To: <0FCDFFC6-761C-4AB0-AD52-6081AC400B7C@strath.ac.uk>
References: <0FCDFFC6-761C-4AB0-AD52-6081AC400B7C@strath.ac.uk>
From: Ole Tange
Date: Thu, 22 Mar 2012 13:46:02 +0100
X-Google-Sender-Auth: JfKlsL70zEEKFTb94yUgZnnajSA
Message-ID:
Subject: Re: GNU Parallel Bug Reports Randomising Job Distribution
To: Alastair Andrew
Content-Type: text/plain; charset=ISO-8859-1
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 209.85.161.169
Cc: parallel@gnu.org, "bug-parallel@gnu.org"
X-BeenThere: parallel@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: GNU Parallel Discussion
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 22 Mar 2012 12:46:45 -0000

On Wed, Mar 21, 2012 at 12:58 PM, Alastair Andrew wrote:

> Is there any way to tell GNU parallel to randomly assign the jobs to machines rather than just chunking them in sequence?
[...]

There is no random assignment.

> If 4 of the biggest jobs get placed on the same machine they starve each other of resources and completely kill that machine.

I assume you do not know in advance which jobs are going to be big. Also you might want to take a look at niceload --io and --start-mem (man niceload).

> This is how I'm using parallel:
>
> parallel -S .. --nice 19 --halt-on-error 0 -j+0 --noswap "solve {1}_{2}.prob" ::: small medium big ::: {1..20}

Using the current code you could:

  parallel echo solve {1}_{2}.prob ::: small medium big ::: {1..20} | shuf | parallel -S .. --nice 19 --halt-on-error 0

> I've set the processes to be nice'd as much as possible

On the email list we have discussed improvement to --load. I am currently considering '--load auto' which should do:

  ncpu = number of cores
  nrunning = number of processes in R state according to `ps -A -o s`
  if nrunning == ncpu:
    do not spawn more processes (all the CPU power is being used - by parallel or others)
  if any children are disk i/o starved:
    do not spawn more processes (disk i/o for this dir is probably all used)
  else:
    increase the number of processes to run

/Ole

From MAILER-DAEMON Thu Mar 22 11:11:16 2012
Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SAjfY-0001aI-OQ for mharc-parallel@gnu.org; Thu, 22 Mar 2012 11:11:16 -0400
Received: from eggs.gnu.org ([208.118.235.92]:57190) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SAjfR-0001Z0-R0 for parallel@gnu.org; Thu, 22 Mar 2012 11:11:15 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SAjfQ-0008HI-0e for parallel@gnu.org; Thu, 22 Mar 2012 11:11:09 -0400
Received: from mail-ey0-f169.google.com ([209.85.215.169]:45879) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SAjfP-0008G5-OP for parallel@gnu.org; Thu, 22 Mar 2012 11:11:07 -0400
Received: by eaal1 with SMTP id l1so748319eaa.0 for ; Thu, 22 Mar 2012 08:11:04 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type:content-transfer-encoding; bh=3EcPBvw6SxgLFbfzl/DKtcY4oKPf08BOcYs2Gl0ZU+Q=; b=xgBi+BCrilP841oosTpWAc+jNuBg7kxNFQb1P3c0VIv1QlTkQ3yozUE6M4TqxNKU99 621Kj2SscndZ+mF6cwan8fQJdjM/Nl/BsvQH/syCG0tV8eDoWYIisr6ncXUa8McA7pSd bLZsbqKP404tqCwmqENelEzC5LU+GUWnKlYcVG7te/+2+rUMmzxm6cidU9MhTJPWdecd lCKdDB0hzlw0WrD60pawcLNBJp921GtzvKGYpG+o8ySsADLR0GqydjkP54awRAgoGU8Q AM+/7l6+bImcGkBL6BCEABFByBeO4AKfx3jexBlpe2IQ/A3xef9QhLVb0mNEpcoEb3LS 3SKQ==
MIME-Version: 1.0
Received: by 10.50.47.135 with SMTP id d7mr1934789ign.66.1332429064246; Thu, 22 Mar 2012 08:11:04 -0700 (PDT)
Received: by 10.50.191.229 with HTTP;
Thu, 22 Mar 2012 08:11:04 -0700 (PDT)
In-Reply-To:
References:
Date: Thu, 22 Mar 2012 11:11:04 -0400
Message-ID:
Subject: Re: Feature request - {..} remove two level of extensions
From: Jay Hacker
To: parallel@gnu.org
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 209.85.215.169
X-BeenThere: parallel@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: GNU Parallel Discussion
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 22 Mar 2012 15:11:15 -0000

I would also like this feature.

On Fri, Mar 16, 2012 at 4:00 PM, Martin Møller Skarbiniks Pedersen wrote:
> Hi,
>  I have a feature request for GNU parallel.
>  The option could be {..} or maybe {.2} and should
> remove the extension from a filename two times eg.
>
> $ ls track07.cdda.wav | parallel echo {..}.flac
> track07.flac
>
> I could use this when I extract files from a music disc with
> cdparanoia and then encode the wav-files with flac.
>
> Today I do this:
>
> $ ls track*wav
> track07.cdda.wav  track10.cdda.wav  track13.cdda.wav
> track08.cdda.wav  track11.cdda.wav  track14.cdda.wav
> track09.cdda.wav  track12.cdda.wav  track15.cdda.wav
>
> $ ls *wav | parallel -n 1 flac {} -o {.}.flac
> [...]
>
> $ ls track*flac
> track07.cdda.flac  track10.cdda.flac  track13.cdda.flac
> track08.cdda.flac  track11.cdda.flac  track14.cdda.flac
> track09.cdda.flac  track12.cdda.flac  track15.cdda.flac
>
> $ ls *flac | parallel -n 1 mv {} {.}
>
> $ ls *cdda | parallel -n 1 mv {} {.}.flac
>
>
> Regards
> Martin
>
>
> --
> To any outsiders reading along: there is no reason to read my
> mail. I have nothing to do with FARC, al-Jihad, al-Qaida, Hamas, Hizb
> al-Mujahidin or ETA. I have never performed Zakat, do not support
> Istishad, have not made a car bomb or nuclear weapons, and I hardly
> know what Al Manar and бомба mean. But thank you for the
> interest shown.
>
> Long live egalitarianism!
> Strengthen mob rule!
> Keep the envy taxes and the café money!
> Hooray for the burden of the elderly!
>

From MAILER-DAEMON Thu Mar 22 11:48:25 2012
Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SAkFV-00018y-6s for mharc-parallel@gnu.org; Thu, 22 Mar 2012 11:48:25 -0400
Received: from eggs.gnu.org ([208.118.235.92]:55421) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SAkFS-000185-1P for parallel@gnu.org; Thu, 22 Mar 2012 11:48:23 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SAkFM-0001ZD-Ob for parallel@gnu.org; Thu, 22 Mar 2012 11:48:21 -0400
Received: from mail-we0-f169.google.com ([74.125.82.169]:63219) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SAkFM-0001Xe-Fy; Thu, 22 Mar 2012 11:48:16 -0400
Received: by werj55 with SMTP id j55so2515410wer.0 for ; Thu, 22 Mar 2012 08:48:13 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=Boh9J8hS25EYayVy/mJfzcTdFe6REOGJG0TFLw6GRjY=; b=ugzFsqIafCxvZS6IGoqGYI+C91Dgxnd2ykbKRMEZE2KBxEbkqS8brWWRcsBxl/tzjJ UCge3d6RUZvtFVra9BuOqBvKt25Kmygf3+iQEzO0gIp3bKbK2vKFI7js3H410BylKVq5 k43gciS4gy61Y7k5BCMG4foFo02KeLm3y/2UJKsL3iXOeVjk8tYJgRcKL1fs4uJJHWl8 5s/Bbg0NVU2f2BrgxzMhjd3j/ViWt/XFX4m/F9jMaX/AFIqzW4lzvjKaxYGMQ3hGwBvi fRu/Zcuf5gcO+S+q6tYW3BWAJmDpki4Y/qoOxDLyRQaTJkAWg6hyMA2UmoN9Y+h+RYNK Vqdw==
MIME-Version: 1.0
Received: by 10.50.222.131 with SMTP id qm3mr2176423igc.66.1332431293109; Thu, 22 Mar 2012 08:48:13 -0700 (PDT)
Received: by 10.50.191.229 with HTTP; Thu, 22 Mar 2012 08:48:12 -0700 (PDT)
In-Reply-To:
References:
Date: Thu, 22 Mar 2012 11:48:12 -0400
Message-ID:
Subject: Re: Slow start to cope with load
From: Jay Hacker
To: Ole Tange
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 74.125.82.169
Cc: parallel@gnu.org
X-BeenThere: parallel@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: GNU Parallel Discussion
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 22 Mar 2012 15:48:23 -0000

Perhaps this is a bit simplistic, but what if you took your idea and also kept a running estimate of the amount of load added by each job? Start out assuming each job adds 1 unit of load, and then measure: "Okay, I started 4 jobs last time, and the load went up by 8, so I estimate each job causes 2 units of load." Then when you sample the difference and current load is say 12, with 16 procs, you'll only add 2 jobs, and the load doesn't go over the max.

I'm not sure exactly how to calculate it, but a first stab might be load_per_job = current_load / job_slots, and then job_slots += (desired_load - current_load) / load_per_job. Really you probably want a moving average. But something like that could let you learn how your jobs affect the system.

-John

On Thu, Mar 15, 2012 at 8:32 PM, Ole Tange wrote:
> Thomas got me thinking.
>
> One of the problems with --load is that it only limits how many jobs
> are started. So you may start way too many. This will give you a load
> of 100:
>
>  seq 100 | nice parallel -j0 --load 2.00 burnP6
>
> and that is most likely not what you want.
>
> While some programs run multiple threads (and thus can give a load > 1
> each) that is the exception. So in general I think we can assume one
> job will at most give a load of 1.
>
> Currently load is only computed every 10 seconds. So we could
> recompute every 10 seconds:
>
>    number_of_concurrent_jobs = max_load - current_load +
> number_of_concurrent_jobs
>
> If the job immediately takes 100% CPU time (like burnP6) then the
> number of processes will grow every 10 seconds with the difference
> between current load and max load. As the load lags behind it may
> cause us to spawn too many processes that will cause a load > max
> load. But when the jobs finish the load will over time drop to the
> max load.
>
> If the job never takes 100% CPU time (like host) then the number of
> processes will grow every 10 seconds with the difference between
> current load and max load.
>
> If the job takes 100% CPU time after some initialization (like blast)
> then the number of processes will grow every 10 seconds with the
> difference between current load and max load. The current load will
> start out small, this may cause us to spawn too many processes that
> will cause a load > max load.
>
> If the job takes >100% CPU time after some initialization (like
> multithreaded blast) then the number of processes will grow every 10
> seconds with the difference between current load and max load. The
> current load will start out small, this may cause us to spawn too many
> processes that will cause a load > max load.
>
> I believe it would be better than the current, but I am very open to
> even better ideas.
> > > /Ole > From MAILER-DAEMON Thu Mar 22 12:21:26 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SAklS-0003X6-85 for mharc-parallel@gnu.org; Thu, 22 Mar 2012 12:21:26 -0400 Received: from eggs.gnu.org ([208.118.235.92]:51273) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SAklO-0003Vq-Tq for parallel@gnu.org; Thu, 22 Mar 2012 12:21:24 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SAklH-00015i-Kk for parallel@gnu.org; Thu, 22 Mar 2012 12:21:22 -0400 Received: from mail-pb0-f41.google.com ([209.85.160.41]:35639) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SAklH-00015e-BY for parallel@gnu.org; Thu, 22 Mar 2012 12:21:15 -0400 Received: by pbcup15 with SMTP id up15so1891048pbc.0 for ; Thu, 22 Mar 2012 09:21:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type :content-transfer-encoding; bh=TO3nBE5zAxdhX1I93Xu+q2Iy4kKu/QOsl4yy2sKGQL0=; b=b/envlcrmU2KY+96eAW+IKzX5JulXcNbLau0Kpc/EYejRcMPc1V8qSe6xe0+mQXw1c qAM6/DbNqMvCtbdAepcmEuEmtyr5bZtYMR4IgRZ2zvS5Iv0EcTL4yNCJq8h3PP9per5Q 4sTH4ZFFlzOSKWI2tSDJwAgKrasNMPGcLJEGPV8Q3f2HrkxwGpeDvhGyMAdxND29fUe+ i8/CBOwxtLqIQPEsnVH/aZj48egg5TZuxje/kZe236h5lC+9C3kmk8isEtNBg2wn4T/7 MR/N8zJxoqAUXgfWuCjJz1FR6HhOw+EYIIVA9A0G7HZeUib0IXlqGySNPabAKk8g9FWz l/fA== Received: by 10.68.74.97 with SMTP id s1mr21659101pbv.46.1332433273302; Thu, 22 Mar 2012 09:21:13 -0700 (PDT) MIME-Version: 1.0 Sender: ole.tange@gmail.com Received: by 10.142.200.11 with HTTP; Thu, 22 Mar 2012 09:20:52 -0700 (PDT) In-Reply-To: References: From: Ole Tange Date: Thu, 22 Mar 2012 17:20:52 +0100 X-Google-Sender-Auth: n9HTH-kFWYPXAQjxU8SIzOhsTLE Message-ID: Subject: Re: Slow start to cope with load To: Jay Hacker Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable 
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 209.85.160.41
Cc: parallel@gnu.org
X-BeenThere: parallel@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: GNU Parallel Discussion
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 22 Mar 2012 16:21:24 -0000

On Thu, Mar 22, 2012 at 4:48 PM, Jay Hacker wrote:

> Perhaps this is a bit simplistic, but what if you took your idea and
> also kept a running estimate of the amount of load added by each job?
> Start out assuming each job adds 1 unit of load, and then measure:
> "Okay, I started 4 jobs last time, and the load went up by 8, so I
> estimate each job causes 2 units of load." Then when you sample the
> difference and current load is say 12, with 16 procs, you'll only add
> 2 jobs, and the load doesn't go over the max.

That would only work on dedicated single user systems.

My servers are (ab)used by 3-5 people at the same time.

But I am warming up to the idea of ignoring load and instead just looking at 'ps -A -o s'.

1: If number of 'R' == number of cpus: Do not start another.
2: If number of 'D' amongst (grand)children >= 1: Do not start another.
3: Else start one more job.

CPU limited tasks will be limited by rule 1.
Disk and NFS I/O limited tasks will be limited by rule 2.
Net I/O will not be limited.

I have not tested what will happen if the machine is swapping.
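
The three rules above could be sketched roughly like this in POSIX shell. It is only an illustration under stated assumptions, not GNU Parallel's code: the function names count_state and decide are made up, rule 1 is written with >= rather than == as a safety margin, and counting 'D' processes among (grand)children is left to the caller because it needs a walk of the process tree.

```sh
#!/bin/sh
# Hypothetical helpers illustrating the three rules; not part of GNU Parallel.

# count_state LETTER
# Reads one process state per line on stdin (for example the output of
# `ps -A -o s=`) and prints how many lines start with LETTER.
count_state() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so the exit status is ignored.
    grep -c "^[[:space:]]*$1" || true
}

# decide NCPU NRUNNING NBLOCKED_CHILDREN
# Prints "hold" if no new job should be started, "start" otherwise.
decide() {
    if [ "$2" -ge "$1" ]; then
        echo hold     # rule 1: all CPUs are busy (>= is an assumption)
    elif [ "$3" -ge 1 ]; then
        echo hold     # rule 2: a (grand)child is waiting on disk I/O
    else
        echo start    # rule 3: room for one more job
    fi
}
```

A caller might obtain the running count with something like `ps -A -o s= | count_state R`, keeping in mind that the ps process itself is usually counted as running at the moment it is sampled.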
/Ole From MAILER-DAEMON Thu Mar 22 17:46:52 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SApqO-0001hW-MK for mharc-parallel@gnu.org; Thu, 22 Mar 2012 17:46:52 -0400 Received: from eggs.gnu.org ([208.118.235.92]:34355) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SApqL-0001fP-RV for parallel@gnu.org; Thu, 22 Mar 2012 17:46:51 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SApqJ-0002rF-Qg for parallel@gnu.org; Thu, 22 Mar 2012 17:46:49 -0400 Received: from mail-gy0-f169.google.com ([209.85.160.169]:36309) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SApqG-0002qX-7E; Thu, 22 Mar 2012 17:46:44 -0400 Received: by ghrr18 with SMTP id r18so2665534ghr.0 for ; Thu, 22 Mar 2012 14:46:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:from:date:x-google-sender-auth:message-id :subject:to:content-type; bh=MgyVaFX2kaDxCpttT3jx80k47wPsVffDKN3zizcb64s=; b=bopo0gNONik7ckomqVmgI/uj4dztXqKC9PHw7tG6CiECfMy7Anxyr6+m5Js0eZNCiF JWke1ml8BFbvYk7OTOI8nhUvuhidNSJjJh5fVvyMDsmxcHhBQdP2uIt7L2nvC/PSXGM6 snSG65R+MHxgQXUgEXfDyDi/uLE62gx3LI83R48E7k9BrpBUw+ef43KABnd69hK4Y+hQ UgAL24gnBuWxwcGy5vYj2qL92PffUMYHQK2M/THfbxflIOvwEV91it51QqRbxP4AYIBJ 4YHEFyiAmG7UruluGlXYxPA047wkVPl5Hz9nv4HX027rPQ+ZRI1QvK8CfoD5BZ6gurQE E+Kg== Received: by 10.68.226.225 with SMTP id rv1mr1452126pbc.149.1332452801504; Thu, 22 Mar 2012 14:46:41 -0700 (PDT) MIME-Version: 1.0 Sender: ole.tange@gmail.com Received: by 10.142.200.11 with HTTP; Thu, 22 Mar 2012 14:46:21 -0700 (PDT) From: Ole Tange Date: Thu, 22 Mar 2012 22:46:21 +0100 X-Google-Sender-Auth: q8gZxOUC7e1kf_vqVqL5WnWAgok Message-ID: Subject: GNU Parallel 20120322 ('#518696') released To: parallel@gnu.org, bug-parallel@gnu.org Content-Type: text/plain; charset=ISO-8859-1 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 209.85.160.169
X-BeenThere: parallel@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: GNU Parallel Discussion
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 22 Mar 2012 21:46:51 -0000

GNU Parallel 20120322 ('#518696') has been released. It is available for download at: http://ftp.gnu.org/gnu/parallel/

This is a bugfix release with no new features. Probably a good release for stable long-term use.

New in this release:

* Parallel Process Database Dumps.
  http://blog.mattoates.co.uk/2012/02/parallel-process-database-dumps.html

* Using GNU Parallel to process images from Mars.
  http://lunokhod.org/?p=468

* Using GNU Parallel with bzgrep.
  http://filip.rembialkowski.net/did-you-know-gnu-parallel/

* Bug fixes and man page updates.

= About GNU Parallel =

GNU Parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe. GNU Parallel can then split the input and pipe it into commands in parallel.

If you use xargs and tee today you will find GNU Parallel very easy to use as GNU Parallel is written to have the same options as xargs. If you write loops in shell, you will find GNU Parallel may be able to replace most of the loops and make them run faster by running several jobs in parallel. GNU Parallel can even replace nested loops.

GNU Parallel makes sure output from the commands is the same output as you would get had you run the commands sequentially. This makes it possible to use output from GNU Parallel as input for other programs.
You can find more about GNU Parallel at: http://www.gnu.org/s/parallel/ Watch the intro video on http://www.youtube.com/playlist?list=PL284C9FF2488BC6D1 or at http://tinyogg.com/watch/TORaR/ and http://tinyogg.com/watch/hfxKj/ When using GNU Parallel for a publication please cite: O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login: The USENIX Magazine, February 2011:42-47. = About GNU SQL = GNU sql aims to give a simple, unified interface for accessing databases through all the different databases' command line clients. So far the focus has been on giving a common way to specify login information (protocol, username, password, hostname, and port number), size (database and table size), and running queries. The database is addressed using a DBURL. If commands are left out you will get that database's interactive shell. When using GNU SQL for a publication please cite: O. Tange (2011): GNU SQL - A Command Line Tool for Accessing Different Databases Using DBURLs, ;login: The USENIX Magazine, April 2011:29-32. = About GNU Niceload = GNU niceload slows down a program when the computer load average (or other system activity) is above a certain limit. When the limit is reached the program will be suspended for some time. If the limit is a soft limit the program will be allowed to run for short amounts of time before being suspended again. If the limit is a hard limit the program will only be allowed to run when the system is below the limit. 
From MAILER-DAEMON Fri Mar 23 17:20:11 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SBBu7-0004un-JD for mharc-parallel@gnu.org; Fri, 23 Mar 2012 17:20:11 -0400 Received: from eggs.gnu.org ([208.118.235.92]:51531) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SBBu4-0004tA-Qn for parallel@gnu.org; Fri, 23 Mar 2012 17:20:10 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SBBu2-0000N2-Qf for parallel@gnu.org; Fri, 23 Mar 2012 17:20:08 -0400 Received: from mail-iy0-f169.google.com ([209.85.210.169]:61401) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SBBu2-0000It-Jr; Fri, 23 Mar 2012 17:20:06 -0400 Received: by iajr24 with SMTP id r24so6371166iaj.0 for ; Fri, 23 Mar 2012 14:20:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=lkvbl52zky1ZD1/84/mH6rvqdS6CdDa65M169/NMvTg=; b=bX2mlij4TsnXoQI3gEPtn4Sh0IIyuQ0v5gtGxA023Oza7yX51v7bNQZE4xUtlUg+SJ BCMWh/lRimeAdOeKWoHsLOzE690IwHO9heXJFirqjv+gze7WztveEb0860+0kDhjeRgi kwPXSbp7TrJaTV1wbm1EaF70W79zfEogOTdlZTot+hrRAFjoAMlcQVgO13h+Uch4PqUt y5H/Q70+6AakoaYx6MfRXVKbH00zm1MqZmsW3hsGDmScC8+0cUsL6ggKoCVF8OxIo++Q dn009bPAOGt6v2mTg1nNPCYkjno8qDLvIglWo4ztBL+Zb9TZc7twP47MNv3jAO3u3Ui9 eQWg== MIME-Version: 1.0 Received: by 10.50.85.232 with SMTP id k8mr94992igz.16.1332537603295; Fri, 23 Mar 2012 14:20:03 -0700 (PDT) Received: by 10.50.191.229 with HTTP; Fri, 23 Mar 2012 14:20:03 -0700 (PDT) In-Reply-To: References: Date: Fri, 23 Mar 2012 17:20:03 -0400 Message-ID: Subject: Re: Slow start to cope with load From: Jay Hacker To: Ole Tange Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 209.85.210.169
Cc: parallel@gnu.org
X-BeenThere: parallel@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: GNU Parallel Discussion
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 23 Mar 2012 21:20:10 -0000

I like that idea, since ps gives an instantaneous count instead of an average. I don't know much about process states, but some brief playing around reveals some issues:

1. It's not accurate. Running `pbzip2 bigfile.txt` while watching `ps -u $USER -o s,pcpu,args` often shows state S (sleeping) even when using 1600% CPU.

2. It doesn't account for multithreaded programs. Running pbzip2 at 1600% CPU shows only one R (running). Fortunately, ps -L seems to fix this, while also helping #1.

3. If I have more than one disk or network adapter, I can usefully have more than one process in the 'D' (I/O) state. This seems tough to get right automatically; perhaps a separate "--ioload" option is easiest.

If the machine is swapping, the user can fix that with --noswap. But swapping is really running out of memory, which is probably best addressed with an orthogonal "--memload" or "--min-free-mem" option.

Having something like

  parallel --load 100% --ioload 4 --min-free-mem 2G

would be awesome: only start new jobs if there are < $num_cpus threads in the R state, < 4 in the D state, and > 2GB free memory. That way I can have lots of processes with highly variable workloads all doing their own thing and not stepping on each other.

(I can see both --memload and --min-free-mem being useful, the first for "I need to reserve 4G system RAM for other stuff," the second for "Each job needs 2G RAM." The second is probably more common.)

On Thu, Mar 22, 2012 at 12:20 PM, Ole Tange wrote:
> On Thu, Mar 22, 2012 at 4:48 PM, Jay Hacker wrote:
>
>> Perhaps this is a bit simplistic, but what if you took your idea and
>> also kept a running estimate of the amount of load added by each job?
>> Start out assuming each job adds 1 unit of load, and then measure:
>> "Okay, I started 4 jobs last time, and the load went up by 8, so I
>> estimate each job causes 2 units of load." Then when you sample the
>> difference and current load is say 12, with 16 procs, you'll only add
>> 2 jobs, and the load doesn't go over the max.
>
> That would only work on dedicated single user systems.
>
> My servers are (ab)used by 3-5 people at the same time.
>
> But I am warming up to the idea of ignoring load and instead just look
> at 'ps -A -o s'.
>
> 1: If number of 'R' == number of cpus: Do not start another.
> 2: If number of 'D' amongst (grand)children >= 1: Do not start another.
> 3: Else start a job more.
>
> CPU limited tasks will be limited by rule 1.
> Disk and NFS I/O limited tasks will be limited by rule 2.
> Net I/O will not be limited.
>
> I have not tested what will happen if the machine is swapping.
>
>
> /Ole

From MAILER-DAEMON Mon Mar 26 05:48:15 2012
Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SC6X9-0008VF-Ay for mharc-parallel@gnu.org; Mon, 26 Mar 2012 05:48:15 -0400
Received: from eggs.gnu.org ([208.118.235.92]:37168) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SC6X1-0008U1-RN for parallel@gnu.org; Mon, 26 Mar 2012 05:48:13 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SC6Wv-0005X8-MJ for parallel@gnu.org; Mon, 26 Mar 2012 05:48:07 -0400
Received: from mailout.rz.uni-frankfurt.de ([141.2.22.233]:34369 helo=mailout.cluster.uni-frankfurt.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SC6Wv-0005Pz-GH for parallel@gnu.org; Mon, 26 Mar 2012 05:48:01 -0400
Received: from smtpauth-ha.cluster.uni-frankfurt.de ([10.1.1.120]) by mailout.cluster.uni-frankfurt.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1SC6Ws-0000Fb-C7 for parallel@gnu.org; Mon, 26 Mar 2012 11:47:58 +0200
Received: from ganymed.kgu.de ([141.2.203.253]
helo=[192.168.161.210]) by fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1SC6Wp-0005P2-9P for parallel@gnu.org; Mon, 26 Mar 2012 11:47:55 +0200
Message-ID: <4F703B4A.1000509@med.uni-frankfurt.de>
Date: Mon, 26 Mar 2012 11:47:54 +0200
From: Thomas Sattler
User-Agent: Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20120312 Thunderbird/11.0
MIME-Version: 1.0
To: parallel@gnu.org
Subject: parallel stops working for no obvious reason
X-Enigmail-Version: 1.4
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 7bit
X-MailScanner: Found to be clean
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3)
X-Received-From: 141.2.22.233
X-BeenThere: parallel@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: GNU Parallel Discussion
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 26 Mar 2012 09:48:13 -0000

Hi there ...

I run parallel to distribute ImageMagick on a 32 Core machine. At first 'convert' failed and Google suggested not to use Debian's pre-compiled binary:

libgomp: Thread creation failed: Resource temporarily unavailable
open3: fork failed: Resource temporarily unavailable at /usr/bin/parallel line 3587

After compiling ImageMagick from source (without libgomp), it seemed OK, but parallel eventually stopped starting new jobs. This happened last week with v20120222; I'll give it another try with v20120322 today.

In case you want to reproduce it, here is how:

$ time parallel -tj256 mk_gradient {1}{2}{3}{4}{5}{6} ::: \
> {0..9} {A..F} ::: {0..9} {A..F} ::: {0..9} {A..F} ::: \
> {0..9} {A..F} ::: {0..9} {A..F} ::: {0..9} {A..F} 2>&1 | nl

with mk_gradient:

#!/bin/bash
[ -s "$1.jpg" ] && exit 0
exec convert -size 1600x1200 gradient:#$1 -gravity Center \
-pointsize 100 -draw "text 0,0 '#$1'" $1.jpg

While PARALLEL was set to

--load 100% --nice 10 --noswap --workdir /scratch

and the system was completely idle.
Please keep in mind that parallel will run several hours before hitting the issue and that several GB of files will be created. Thomas From MAILER-DAEMON Mon Mar 26 07:53:23 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SC8UF-0007po-U0 for mharc-parallel@gnu.org; Mon, 26 Mar 2012 07:53:23 -0400 Received: from eggs.gnu.org ([208.118.235.92]:52061) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SC8UE-0007pa-1z for parallel@gnu.org; Mon, 26 Mar 2012 07:53:23 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SC8UC-0000a3-Fq for parallel@gnu.org; Mon, 26 Mar 2012 07:53:21 -0400 Received: from mailout.rz.uni-frankfurt.de ([141.2.22.233]:46482 helo=mailout.cluster.uni-frankfurt.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SC8UC-0000ZY-9S for parallel@gnu.org; Mon, 26 Mar 2012 07:53:20 -0400 Received: from smtpauth-ha.cluster.uni-frankfurt.de ([10.1.1.120]) by mailout.cluster.uni-frankfurt.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1SC8U9-0002k0-NR for parallel@gnu.org; Mon, 26 Mar 2012 13:53:17 +0200 Received: from ganymed.kgu.de ([141.2.203.253] helo=[192.168.161.210]) by fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1SC8U6-0002GX-DF for parallel@gnu.org; Mon, 26 Mar 2012 13:53:14 +0200 Message-ID: <4F7058AB.70005@med.uni-frankfurt.de> Date: Mon, 26 Mar 2012 13:53:15 +0200 From: Thomas Sattler User-Agent: Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20120312 Thunderbird/11.0 MIME-Version: 1.0 To: parallel@gnu.org Subject: BUG: swap_activity broken on Debian in v20120322 X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: 7bit X-MailScanner: Found to be clean X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 141.2.22.233 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel 
Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 26 Mar 2012 11:53:23 -0000 Hi there ... "swap_activity" is broken on Debian in v20120322: $ date | parallel-20120322 --noswap echo mv: cannot stat `/home/tsattler/.parallel/tmp/swap_activity-5812-:5812': No such file or directory mv: cannot stat `/home/tsattler/.parallel/tmp/swap_activity-5812-:5812': No such file or directory mv: cannot stat `/home/tsattler/.parallel/tmp/swap_activity-5812-:5812': No such file or directory [Ctrl-C] $ sh: vm_stat: not found After reverting the "remote MAC"-stuff introduced between v20120222 and v20120322, it's now working again. Thomas From MAILER-DAEMON Mon Mar 26 09:22:58 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SC9sw-0001Q8-Gi for mharc-parallel@gnu.org; Mon, 26 Mar 2012 09:22:58 -0400 Received: from eggs.gnu.org ([208.118.235.92]:56637) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SC9sr-0001PZ-CG for parallel@gnu.org; Mon, 26 Mar 2012 09:22:57 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SC9sp-0004SN-CE for parallel@gnu.org; Mon, 26 Mar 2012 09:22:52 -0400 Received: from mail-vx0-f169.google.com ([209.85.220.169]:33716) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SC9sp-0004Rm-4T for parallel@gnu.org; Mon, 26 Mar 2012 09:22:51 -0400 Received: by vcbfk14 with SMTP id fk14so5414998vcb.0 for ; Mon, 26 Mar 2012 06:22:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=PG0bnHbHRmcY//o4ymQ0iPITSN7GMWPGzVFilJ8FqxQ=; b=ARVsHi3vxxM3T94BriZ1LdImIgW1n5Ze4/4BggJHBd68XwXp2vvNj9f0Sr80uKudUR XgqnNTtMzZje5JwVq3ZcoYRNlkgAlrd7OIm+Xnl50vxwHTy2MetiOZULScFDjqVc8XQo zT8nEqOJhX62UTG1gv3kFuSWbMZ9Q7sqw8aN8Rn6s4EUrU27dvC3ClK0rR0zuRGx17lS 
RC2xRyF6xrYs0reVBx/jN9VLxvTGR473Lxum12ax4nFgQYv8dUasgpccBohCWmWAB/vA jWTMn3ao3yJ9Vjyvf1fGWkSggReSXpx0nabq5S7IjLjeH9gu91bGX+2x0fVubcG1nwXd D4GQ== MIME-Version: 1.0 Received: by 10.52.21.137 with SMTP id v9mr655506vde.61.1332768168369; Mon, 26 Mar 2012 06:22:48 -0700 (PDT) Received: by 10.52.156.45 with HTTP; Mon, 26 Mar 2012 06:22:48 -0700 (PDT) In-Reply-To: <4F7058AB.70005@med.uni-frankfurt.de> References: <4F7058AB.70005@med.uni-frankfurt.de> Date: Mon, 26 Mar 2012 15:22:48 +0200 Message-ID: Subject: Re: BUG: swap_activity broken on Debian in v20120322 From: =?ISO-8859-1?Q?Martin_M=F8ller_Skarbiniks_Pedersen?= To: Thomas Sattler Content-Type: text/plain; charset=ISO-8859-1 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 209.85.220.169 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 26 Mar 2012 13:22:58 -0000 On 26 March 2012 13:53, Thomas Sattler wrote: > Hi there ... > > "swap_activity" is broken on Debian in v20120322: > Same problem on a older parallel: $ lsb_release -d Description: Ubuntu 10.04.4 LTS $ parallel --version | head -1 GNU parallel 20110822 $ seq 1 4 | parallel --noswap echo sh: cannot create /home/mm/.parallel/tmp/swap_activity-21336-:21336: Directory nonexistent mv: cannot stat `/home/mm/.parallel/tmp/swap_activity-21336-:21336': No such file or directory sh: cannot create /home/mm/.parallel/tmp/swap_activity-21336-:21336: Directory nonexistent mv: cannot stat `/home/mm/.parallel/tmp/swap_activity-21336-:21336': No such file or directory [...] 
From MAILER-DAEMON Mon Mar 26 11:30:52 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SCBsi-0001vZ-7o for mharc-parallel@gnu.org; Mon, 26 Mar 2012 11:30:52 -0400 Received: from eggs.gnu.org ([208.118.235.92]:56041) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SCBsb-0001tt-FL for parallel@gnu.org; Mon, 26 Mar 2012 11:30:51 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SCBsS-0005BG-Pn for parallel@gnu.org; Mon, 26 Mar 2012 11:30:45 -0400 Received: from mailout.rz.uni-frankfurt.de ([141.2.22.233]:32922 helo=mailout.cluster.uni-frankfurt.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SCBsS-0005B6-KI for parallel@gnu.org; Mon, 26 Mar 2012 11:30:36 -0400 Received: from smtpauth2-ha.cluster.uni-frankfurt.de ([10.1.1.121]) by mailout.cluster.uni-frankfurt.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1SCBsQ-0002V3-CD for parallel@gnu.org; Mon, 26 Mar 2012 17:30:34 +0200 Received: from ganymed.kgu.de ([141.2.203.253] helo=[192.168.161.210]) by fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1SCBsN-0003kV-Iu for parallel@gnu.org; Mon, 26 Mar 2012 17:30:31 +0200 Message-ID: <4F708B97.4040209@med.uni-frankfurt.de> Date: Mon, 26 Mar 2012 17:30:31 +0200 From: Thomas Sattler User-Agent: Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20120312 Thunderbird/11.0 MIME-Version: 1.0 To: parallel@gnu.org Subject: Re: parallel stops working for no obvious reason References: <4F703B4A.1000509@med.uni-frankfurt.de> In-Reply-To: <4F703B4A.1000509@med.uni-frankfurt.de> X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: 7bit X-MailScanner: Found to be clean X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 141.2.22.233 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , 
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 26 Mar 2012 15:30:51 -0000

> This happened last week with v20120222; I'll give it another
> try with v20120322 today.

Might be fixed in 20120322: the job is still running and >500000
files have been created. At least, 20120222 stopped earlier.

Thomas

From MAILER-DAEMON Tue Mar 27 05:13:07 2012
Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SCSSh-0001bd-TO for mharc-parallel@gnu.org; Tue, 27 Mar 2012 05:13:07 -0400
Received: from eggs.gnu.org ([208.118.235.92]:58959) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SCSSb-0001LE-4c for parallel@gnu.org; Tue, 27 Mar 2012 05:13:06 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SCSSU-0002T4-Qc for parallel@gnu.org; Tue, 27 Mar 2012 05:13:00 -0400
Received: from mailout.rz.uni-frankfurt.de ([141.2.22.233]:58427 helo=mailout.cluster.uni-frankfurt.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SCSSU-0002RV-KT for parallel@gnu.org; Tue, 27 Mar 2012 05:12:54 -0400
Received: from smtpauth2-ha.cluster.uni-frankfurt.de ([10.1.1.121]) by mailout.cluster.uni-frankfurt.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1SCSSF-0005g7-UC for parallel@gnu.org; Tue, 27 Mar 2012 11:12:40 +0200
Received: from p54b0bbdd.dip.t-dialin.net ([84.176.187.221] helo=[192.168.2.17]) by fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1SCSSD-0007hk-Ux for parallel@gnu.org; Tue, 27 Mar 2012 11:12:38 +0200
Message-ID: <4F718484.1070602@med.uni-frankfurt.de>
Date: Tue, 27 Mar 2012 11:12:36 +0200
From: Thomas Sattler
User-Agent: Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20120312 Thunderbird/11.0
MIME-Version: 1.0
To: parallel@gnu.org
Subject: Re: parallel stops working for no obvious reason
References: <4F703B4A.1000509@med.uni-frankfurt.de> <4F708B97.4040209@med.uni-frankfurt.de>
In-Reply-To:
<4F708B97.4040209@med.uni-frankfurt.de>
X-Enigmail-Version: 1.4
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 7bit
X-MailScanner: Found to be clean
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3)
X-Received-From: 141.2.22.233
X-BeenThere: parallel@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: GNU Parallel Discussion
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 27 Mar 2012 09:13:06 -0000

On 26.03.2012 17:30, Thomas Sattler wrote:
>> This happened last week with v20120222; I'll give it another
>> try with v20120322 today.
>
> Might be fixed in 20120322: the job is still running and >500000
> files have been created. At least, 20120222 stopped earlier.

No real news: 'parallel' is still running. Nearly 1,600,000 jobs
have run so far. 20120222 definitely stopped before 700,000.

I'll keep it running until everything is done, but I'll only
reply to this mail if another error occurs.
Thomas From MAILER-DAEMON Tue Mar 27 05:31:18 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SCSkI-000563-Fl for mharc-parallel@gnu.org; Tue, 27 Mar 2012 05:31:18 -0400 Received: from eggs.gnu.org ([208.118.235.92]:50346) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SCSk6-00055j-IE for parallel@gnu.org; Tue, 27 Mar 2012 05:31:16 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SCSk3-0007P3-S6 for parallel@gnu.org; Tue, 27 Mar 2012 05:31:06 -0400 Received: from mail-pb0-f41.google.com ([209.85.160.41]:42238) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SCSk3-0007On-K0 for parallel@gnu.org; Tue, 27 Mar 2012 05:31:03 -0400 Received: by pbcup15 with SMTP id up15so11268pbc.0 for ; Tue, 27 Mar 2012 02:31:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type :content-transfer-encoding; bh=ExwY16BRfnBRW/0BwcFLJtA6YsJLITupWlFRqKyjCnQ=; b=d+mWsSAX4EQvjTVue0wvkJ/gxiwWLve4dlvqzOFLpXff3q1XHzTdY2or4YA+LMV2OJ 3LtitfTWtSeVji1LN4MtE4J0EWGvPjOJK+gk8J994fqmVrgyOd9vxfks11wdUW1+MoT0 OuJYuQBS+b91rbaQy3lXW9xtNmuzlhp8AjmE+JMoK9ycb33hkZVUbD4EQb2tG55EcaNy MQQX7vlM9Hg3lDvwi/1RX1WifhBtflkXC6G4zjHDEkjvWXUvYiinW4j5j4Qr0uFK65yZ oIsJkBoGNXvH9FYziPPbB7C65rhv2ZfbMmLKNpz8GfetbBZRv5cT9l2x+v+vV/HK470E /pdg== Received: by 10.68.220.65 with SMTP id pu1mr57340991pbc.32.1332840660541; Tue, 27 Mar 2012 02:31:00 -0700 (PDT) MIME-Version: 1.0 Sender: ole.tange@gmail.com Received: by 10.142.200.11 with HTTP; Tue, 27 Mar 2012 02:30:40 -0700 (PDT) In-Reply-To: <4F703B4A.1000509@med.uni-frankfurt.de> References: <4F703B4A.1000509@med.uni-frankfurt.de> From: Ole Tange Date: Tue, 27 Mar 2012 11:30:40 +0200 X-Google-Sender-Auth: Hbst9Ki9ma8mqS_M-tfD9Z_oCBU Message-ID: Subject: Re: parallel stops working for no obvious reason To: Thomas Sattler Content-Type: 
text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 209.85.160.41
Cc: parallel@gnu.org
X-BeenThere: parallel@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: GNU Parallel Discussion
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 27 Mar 2012 09:31:17 -0000

On Mon, Mar 26, 2012 at 11:47 AM, Thomas Sattler wrote:

> I run parallel
>
>  libgomp: Thread creation failed: Resource temporarily unavailable
>  open3: fork failed: Resource temporarily unavailable at
> /usr/bin/parallel line 3587
>
> This happened last week with v20120222; I'll give it another
> try with v20120322 today.

One of the fixes in 20120222 was reserving 1 or 2 extra filehandles.
This might be what you are seeing.

> In case you want to reproduce it, here is how:
>
> [...long description requiring running 700.000 jobs on 32 cores...]

As you can probably imagine, that is hard to reproduce. See if you can
make a smaller example fail - preferably something that can run on
smaller machines.
/Ole From MAILER-DAEMON Thu Mar 29 15:23:46 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SDKwk-0006jB-5o for mharc-parallel@gnu.org; Thu, 29 Mar 2012 15:23:46 -0400 Received: from eggs.gnu.org ([208.118.235.92]:43331) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDKwh-0006iM-IK for parallel@gnu.org; Thu, 29 Mar 2012 15:23:44 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SDKwg-0002Zx-1B for parallel@gnu.org; Thu, 29 Mar 2012 15:23:43 -0400 Received: from mail-wi0-f177.google.com ([209.85.212.177]:41390) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDKwf-0002Zc-Oa for parallel@gnu.org; Thu, 29 Mar 2012 15:23:41 -0400 Received: by wibhj13 with SMTP id hj13so317184wib.12 for ; Thu, 29 Mar 2012 12:23:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=rytolyoDI+xkRoo7pxs4ZXvsotzoVf+03P4qab6Qico=; b=vYK+fsil/2WMjmngCjSY1Mz/3q+boNMjCX/YbGqYgd57Iao7BS7a1IzOlGO5S2wVfW SiY4RtSYZ0u1n8dFUHl4ZB/1ryZioJU6Fvv28V+lpG7aM1ZKWu00KZDkUlhphRB8AlwH L78DNaahS+CfjrsXjeUiRRshA+w3dKVRl3tmcr+2V2e6+mNPcB4USe7DmlxZi3pi6Alv jxRy01vNq5oBXxamoNRElrwsfuzcLXYvpsilNH3Aclc+FjzapPOOBCTdVX2RPs5IID/o 4JwP5/uJYzXURVefvCdvMcV2UWAnkSSZILLqUCHBvRG31sviTzOlnS/Heo/wh+LitXSQ 39Vw== MIME-Version: 1.0 Received: by 10.180.102.101 with SMTP id fn5mr8531966wib.6.1333049019191; Thu, 29 Mar 2012 12:23:39 -0700 (PDT) Received: by 10.223.105.210 with HTTP; Thu, 29 Mar 2012 12:23:39 -0700 (PDT) Date: Thu, 29 Mar 2012 12:23:39 -0700 Message-ID: Subject: Pipe status codes From: David Erickson To: parallel@gnu.org Content-Type: text/plain; charset=ISO-8859-1 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 209.85.212.177 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Mar 2012 19:23:44 -0000 Hi there- I just discovered parallel and it looks great. One existing use case I have for commands I would like to run is something like: parallel 'cmd1 {} 2>&1 | tee somewhere.log' I have a couple questions regarding this use case: 1) Will parallel correctly honor my request to redirect stderr to stdout? 2) Is there some way to use --halt with the error code returned from cmd1? IE in bash I can do this with ${PIPESTATUS[0]}, because tee will always return 0. Alternatively, can parallel log both stdout and stderr for each command to a unique file with naming of my choice? Thanks! David From MAILER-DAEMON Thu Mar 29 19:54:55 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SDPB8-00027c-Vq for mharc-parallel@gnu.org; Thu, 29 Mar 2012 19:54:54 -0400 Received: from eggs.gnu.org ([208.118.235.92]:47328) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDPB6-00026V-L6 for parallel@gnu.org; Thu, 29 Mar 2012 19:54:54 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SDPB4-0001zc-Nj for parallel@gnu.org; Thu, 29 Mar 2012 19:54:52 -0400 Received: from mail-ob0-f169.google.com ([209.85.214.169]:49458) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDPB4-0001zP-Fv for parallel@gnu.org; Thu, 29 Mar 2012 19:54:50 -0400 Received: by obbta14 with SMTP id ta14so154641obb.0 for ; Thu, 29 Mar 2012 16:54:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:date:x-google-sender-auth:message-id:subject :from:to:content-type; bh=ekmEV65e7HavH2R6J11c+Fpxb+R/yRJ1vWhL4SdgDNs=; b=ngVEzMLMz9elSspASktUw0lTSRCjrhDVdqnq5XdFvJ87v0tncYtITV+Ws+3QURnQJ1 
WolK5BX5f4oliVhd2JnxYKGwZgsqpphDD8W/GSeQgzYktIyvoo8KqDCw3fJ/LsSTIW8G zd+/2yBW4JwJeRPn1ATbDCebndTucwyEsT19rVg9r39NqFoeBWUm5xH6CQzqDwP2fmhg q0Hdd1bzvMCCFBQGPvPRnAfDTy2q5JL3DpO/WKFrnn/Ioa/j2JcVQRgoJp8IxmAiEn+e axhAJJh14TmNROYXjgJo814YcW0c4imgCiKxoapybLiKoVx9z59/aTwmeFUoddh0JX78 sKnw== MIME-Version: 1.0 Received: by 10.60.4.1 with SMTP id g1mr78721oeg.55.1333065287768; Thu, 29 Mar 2012 16:54:47 -0700 (PDT) Sender: owen.solberg@gmail.com Received: by 10.182.44.37 with HTTP; Thu, 29 Mar 2012 16:54:47 -0700 (PDT) Date: Thu, 29 Mar 2012 16:54:47 -0700 X-Google-Sender-Auth: AgxiRh1GZurmE0q7njejE8yOXGU Message-ID: Subject: unexpected behavior when using GNU parallel with block and recstart to break up fasta file From: juncus@gmail.com To: parallel@gnu.org Content-Type: text/plain; charset=ISO-8859-1 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 209.85.214.169 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Mar 2012 23:54:54 -0000 Hello, I don't need to say how great GNU parallel is (GREAT!). But for the first time, I have encountered a behavior I didn't expect from it. I am trying to break up a big input FASTA file (DNA sequence) using the --block and --recstart options. But it always seems to create ONE more file than I really want. I mean, if I have specified 10 jobs (-j 10), and if the block size on the 10th job is still below my specification (--block 1200), why does it make an 11th file? This means that 10 jobs in parallel run, and then 1 MORE job has to run to get the last record. >From the man page, for the section on --pipe: "...The block read will have the final partial record removed before the block is passed on to the job. The partial record will be prepended to next block." 
I *think* I understand why it considers the last record to be partial
-- is it because I haven't given it a --recend so it doesn't actually
KNOW that the last record is the last record ?? -- but I am not sure
how to specify --recend for a FASTA file.

Can anyone help? Is this possibly a bug?

Many thanks in advance,
Owen

This single line provides a reproducible example (a little weird to
use tee this way, but it's how I am tracking how the blocks are
working):

$ seq 1000 | sed 's/^/>header\n/' | parallel -j 10 --block 1200 --recstart '>' --pipe "tee partialpseudofasta_{#}.txt >/dev/null"

$ ls -l partialpseudofasta_*
-rw-rw-r-- 1 staff staff 1092 Mar 29 16:25 partialpseudofasta_10.txt
-rw-rw-r-- 1 staff staff   13 Mar 29 16:25 partialpseudofasta_11.txt
-rw-rw-r-- 1 staff staff    0 Mar 29 16:25 partialpseudofasta_12.txt
-rw-rw-r-- 1 staff staff    0 Mar 29 16:25 partialpseudofasta_13.txt
-rw-rw-r-- 1 staff staff    0 Mar 29 16:25 partialpseudofasta_14.txt
-rw-rw-r-- 1 staff staff    0 Mar 29 16:25 partialpseudofasta_15.txt
-rw-rw-r-- 1 staff staff    0 Mar 29 16:25 partialpseudofasta_16.txt
-rw-rw-r-- 1 staff staff    0 Mar 29 16:25 partialpseudofasta_17.txt
-rw-rw-r-- 1 staff staff    0 Mar 29 16:25 partialpseudofasta_18.txt
-rw-rw-r-- 1 staff staff    0 Mar 29 16:25 partialpseudofasta_19.txt
-rw-rw-r-- 1 staff staff 1188 Mar 29 16:25 partialpseudofasta_1.txt
-rw-rw-r-- 1 staff staff    0 Mar 29 16:25 partialpseudofasta_20.txt
-rw-rw-r-- 1 staff staff 1200 Mar 29 16:25 partialpseudofasta_2.txt
-rw-rw-r-- 1 staff staff 1200 Mar 29 16:25 partialpseudofasta_3.txt
-rw-rw-r-- 1 staff staff 1200 Mar 29 16:25 partialpseudofasta_4.txt
-rw-rw-r-- 1 staff staff 1200 Mar 29 16:25 partialpseudofasta_5.txt
-rw-rw-r-- 1 staff staff 1200 Mar 29 16:25 partialpseudofasta_6.txt
-rw-rw-r-- 1 staff staff 1200 Mar 29 16:25 partialpseudofasta_7.txt
-rw-rw-r-- 1 staff staff 1200 Mar 29 16:25 partialpseudofasta_8.txt
-rw-rw-r-- 1 staff staff 1200 Mar 29 16:25 partialpseudofasta_9.txt

$ parallel --version
GNU parallel
20120322 Copyright (C) 2007,2008,2009,2010,2011,2012 Ole Tange and Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. GNU parallel comes with no warranty. Web site: http://www.gnu.org/software/parallel When using GNU Parallel for a publication please cite: O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login: The USENIX Magazine, February 2011:42-47. From MAILER-DAEMON Thu Mar 29 20:41:01 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SDPtk-0001Nw-Vg for mharc-parallel@gnu.org; Thu, 29 Mar 2012 20:41:00 -0400 Received: from eggs.gnu.org ([208.118.235.92]:34445) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDPth-0001MZ-O9 for parallel@gnu.org; Thu, 29 Mar 2012 20:40:59 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SDPtf-00045e-UA for parallel@gnu.org; Thu, 29 Mar 2012 20:40:57 -0400 Received: from mail-gy0-f169.google.com ([209.85.160.169]:44034) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDPtf-00045Z-NB for parallel@gnu.org; Thu, 29 Mar 2012 20:40:55 -0400 Received: by ghrr18 with SMTP id r18so60892ghr.0 for ; Thu, 29 Mar 2012 17:40:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type :content-transfer-encoding; bh=4o0kNTmCQQOt+MeqLQNxy70xaWjUJLtBsc7RixYOSys=; b=JaAsCH7VFWTmWq1mA4KtBEeEF11h7ZGH5VWtj2Wv9HCS2hCu/XfVZVOi2kluLeggHF kcwgixuBuX8ctkmBi+SPqe0Lpp54Xi3vtNpax+3v2Opf3gJ0X5StCwsEAheVj0XW/QoP BAW0xlNj2MtrP+eHBu/qGcPTkYXTMjQixB7knEWrqNy6mnicr9qXAd/T9LZ3xpqpBep6 filV4vEX8hhX5cMRASIZFAWAXn+O7gbitXBmRcHQo+TAvUNUMtH/YZQZcW0fiU7+Pn0s FburHrejaUKU/voDmdGUq7krL8lP24m9UBu7y8QIP8fniRlKpnHq/Qjqo1O2k9eXyMLJ vi3Q== Received: by 10.68.197.65 with SMTP id is1mr4230318pbc.70.1333068052523; Thu, 29 Mar 2012 
17:40:52 -0700 (PDT)
MIME-Version: 1.0
Sender: ole.tange@gmail.com
Received: by 10.143.163.11 with HTTP; Thu, 29 Mar 2012 17:40:32 -0700 (PDT)
In-Reply-To:
References:
From: Ole Tange
Date: Fri, 30 Mar 2012 02:40:32 +0200
X-Google-Sender-Auth: QrX-pt51X2afyTEjKQjHm5ahWTs
Message-ID:
Subject: Re: unexpected behavior when using GNU parallel with block and recstart to break up fasta file
To: juncus@gmail.com
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 209.85.160.169
Cc: parallel@gnu.org
X-BeenThere: parallel@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: GNU Parallel Discussion
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 30 Mar 2012 00:40:59 -0000

On Fri, Mar 30, 2012 at 1:54 AM, wrote:
> Hello,
>
> I don't need to say how great GNU parallel is (GREAT!).

Good to hear.

> But for the
> first time, I have encountered a behavior I didn't expect from it. I
> am trying to break up a big input FASTA file (DNA sequence) using the
> --block and --recstart options. But it always seems to create ONE
> more file than I really want. I mean, if I have specified 10 jobs (-j
> 10), and if the block size on the 10th job is still below my
> specification (--block 1200), why does it make an 11th file? This
> means that 10 jobs in parallel run, and then 1 MORE job has to run to
> get the last record.
It sounds like: https://savannah.gnu.org/bugs/?34241 /Ole From MAILER-DAEMON Thu Mar 29 21:02:55 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SDQEx-00051u-Uk for mharc-parallel@gnu.org; Thu, 29 Mar 2012 21:02:55 -0400 Received: from eggs.gnu.org ([208.118.235.92]:33751) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDQEv-00051N-4n for parallel@gnu.org; Thu, 29 Mar 2012 21:02:54 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SDQEt-0008UO-DQ for parallel@gnu.org; Thu, 29 Mar 2012 21:02:52 -0400 Received: from mail-yw0-f41.google.com ([209.85.213.41]:33317) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDQEt-0008UC-6G for parallel@gnu.org; Thu, 29 Mar 2012 21:02:51 -0400 Received: by yhr47 with SMTP id 47so69240yhr.0 for ; Thu, 29 Mar 2012 18:02:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type :content-transfer-encoding; bh=BhrKqL6p3+iM4GI7vQqK6GlItCndNtipO7AT26te3S8=; b=CtH4s1K9Ux4YVPGadrvjC8S7wUYxphx77X7JgyN9fn9WuIRjK/jSQgJVwlXVczSeKw V1tUEDUCFu+xvXC3DSnyWAn2cVSdJFJFxVjqLUUSHj7gTJ2J+9d7gMEyeewb/1+RmW1m QxVGa3yMV3HK95CD5FtRIx6TauktgiB6YBWBX2dZ2/59nmfAOGBsVGytxymfUxMS/QZK B1ejRtAdg7h+yKZD2DsmxXZ0Q9u2Gt98cBKPHnPvTXcL0Z9xr9bUqnJk9yFroVy6VHjc yM6H+tDyfGcVen+YLQPNQo23mWTn8Jnix6Zh5hKsgtAwdV0IBdnM3eFCfShd+RK8B86V g4Qg== Received: by 10.68.197.65 with SMTP id is1mr4347885pbc.70.1333069369145; Thu, 29 Mar 2012 18:02:49 -0700 (PDT) MIME-Version: 1.0 Sender: ole.tange@gmail.com Received: by 10.143.163.11 with HTTP; Thu, 29 Mar 2012 18:02:28 -0700 (PDT) In-Reply-To: References: From: Ole Tange Date: Fri, 30 Mar 2012 03:02:28 +0200 X-Google-Sender-Auth: 8nRTK_ISbV_fPwv4UKFQrx3yGIg Message-ID: Subject: Re: Pipe status codes To: David Erickson Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 
quoted-printable
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 209.85.213.41
Cc: parallel@gnu.org
X-BeenThere: parallel@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: GNU Parallel Discussion
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 30 Mar 2012 01:02:54 -0000

On Thu, Mar 29, 2012 at 9:23 PM, David Erickson wrote:
> Hi there-
> I just discovered parallel and it looks great.

Good to hear. How did you discover it? What could I have done to make
you discover it earlier?

> One existing use case
> I have for commands I would like to run is something like:
>
> parallel 'cmd1 {} 2>&1 | tee somewhere.log'
>
> I have a couple questions regarding this use case:
>
> 1) Will parallel correctly honor my request to redirect stderr to stdout?

This is easy for you to test, so I will leave that for you as an exercise.

> 2) Is there some way to use --halt with the error code returned from
> cmd1? IE in bash I can do this with ${PIPESTATUS[0]}, because tee
> will always return 0.

GNU Parallel looks at the exit code, so you can do:

  parallel 'cmd1 {} 2>&1 | tee somewhere.log; exit ${PIPESTATUS[0]}'

You can combine this with --halt if you need GNU Parallel not to spawn
more jobs. If you just need to know how many jobs failed, the default
(--halt 0) will work for you.

> Alternatively, can parallel log both stdout and
> stderr for each command to a unique file with naming of my choice?

You probably want one of these:

  ... | parallel cmd1 {} > all_stdout 2> all_stderr
  ... | parallel --tag cmd1 {} > all_stdout 2> all_stderr
  ... | parallel cmd1 {} \> {}_stdout 2\> {}_stderr
  ... | parallel cmd1 {} \> {#}_stdout 2\> {#}_stderr

But more creative ways exist. If every 3rd argument is the
stdout-name and the next is the stderr-name:

  ...
| parallel -N3 cmd1 {1} \> {2} 2\> {3} If you have a file with the names of the stdout-names and another file with the stderr-filenames: ... | parallel --xapply cmd1 {1} \> {2} 2\> {3} :::: - stdoutfilenames stderrfilenames Also you may find --joblog useful. /Ole From MAILER-DAEMON Thu Mar 29 22:12:28 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SDRKG-00016i-H4 for mharc-parallel@gnu.org; Thu, 29 Mar 2012 22:12:28 -0400 Received: from eggs.gnu.org ([208.118.235.92]:60568) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDRKD-00014s-Ht for parallel@gnu.org; Thu, 29 Mar 2012 22:12:26 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SDRKB-00077v-9t for parallel@gnu.org; Thu, 29 Mar 2012 22:12:25 -0400 Received: from mail-lpp01m010-f41.google.com ([209.85.215.41]:51280) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDRKA-000779-Vu; Thu, 29 Mar 2012 22:12:23 -0400 Received: by lagz14 with SMTP id z14so244862lag.0 for ; Thu, 29 Mar 2012 19:12:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=jglQG0ofumP5FHuDOXMe8hgfSX/QucIhrq2mcCVjHp0=; b=B7DLySgyd8d7nd3uLNHwegj6TexasJ36mJRvgQRghRoaWzPpMGb9XXVa3WkHqbY0s4 LbuDkSRMhbwJ+0HX+Vyhn8AlGn3LOefg+HTTWF/DRd/T9Xrl2B7V4bMESApbwUtVn7TT kI+HRif82JEBC3KIWn4EC2JWFrSQI6CkFDJZ046v3GMr6ZeaTajnzVpCWPTklbDCy5Y1 Q83eGrlQwxqoAnijQ7ZbryRgaQ82M96HAfpvUfbZi5WNyWyh6RDN9tA8MOGZZeqo4GTg YtafQLeTv2ff8AlWyEDGUrHIIJZ+CpXInIZV3wFFrzvWK6OI80vq+MW/pkLFFjw5TZGz Os/w== MIME-Version: 1.0 Received: by 10.152.113.229 with SMTP id jb5mr483472lab.45.1333073539626; Thu, 29 Mar 2012 19:12:19 -0700 (PDT) Received: by 10.152.21.170 with HTTP; Thu, 29 Mar 2012 19:12:19 -0700 (PDT) In-Reply-To: References: Date: Thu, 29 Mar 2012 19:12:19 -0700 Message-ID: Subject: Re: Pipe status codes From: David Erickson 
To: Ole Tange
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 209.85.215.41
Cc: parallel@gnu.org
X-BeenThere: parallel@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: GNU Parallel Discussion
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 30 Mar 2012 02:12:27 -0000

On Thu, Mar 29, 2012 at 6:02 PM, Ole Tange wrote:
> On Thu, Mar 29, 2012 at 9:23 PM, David Erickson wrote:
>> Hi there-
>> I just discovered parallel and it looks great.
>
> Good to hear. How did you discover it? What could I have done to make
> you discover it earlier?

I found it through one of the comments in this thread:
http://stackoverflow.com/questions/463963/parallel-processing-from-a-command-queue-on-linux-bash-python-ruby-whateve

I'm not sure what keywords you use on your website. I believe I
tried searching for "bash job queue" to begin with, and it led me to the
above. I'm surprised it doesn't come up in every conversation about
a tool like this, because this project has about a zillion more
features than any of the others I ran into in the same category.

>
>> One existing use case
>> I have for commands I would like to run is something like:
>>
>> parallel 'cmd1 {} 2>&1 | tee somewhere.log'
>>
>> I have a couple questions regarding this use case:
>>
>> 1) Will parallel correctly honor my request to redirect stderr to stdout?
>
> This is easy for you to test, so I will leave that for you as an exercise.
>
>> 2) Is there some way to use --halt with the error code returned from
>> cmd1? IE in bash I can do this with ${PIPESTATUS[0]}, because tee
>> will always return 0.
>
> GNU Parallel looks at the exit code, so you can do:
>
> parallel 'cmd1 {} 2>&1 | tee somewhere.log; exit ${PIPESTATUS[0]}'
>
> You can combine this with --halt if you need GNU Parallel not to spawn
> more jobs. If you just need to know how many jobs failed, the default
> (--halt 0) will work for you.

Perfect, --halt 2 is what I'm looking for.

>
>> Alternatively, can parallel log both stdout and
>> stderr for each command to a unique file with naming of my choice?
>
> You probably want one of these:
>
>  ... | parallel cmd1 {} > all_stdout 2> all_stderr
>  ... | parallel --tag cmd1 {} > all_stdout 2> all_stderr
>  ... | parallel cmd1 {} \> {}_stdout 2\> {}_stderr
>  ... | parallel cmd1 {} \> {#}_stdout 2\> {#}_stderr
>
> But more creative ways exist. If every 3rd argument is the
> stdout-name and the next is the stderr-name:
>
>  ... | parallel -N3 cmd1 {1} \> {2} 2\> {3}
>
> If you have a file with the names of the stdout-names and another file
> with the stderr-filenames:
>
>  ... | parallel --xapply cmd1 {1} \> {2} 2\> {3} :::: -
> stdoutfilenames stderrfilenames
>
> Also you may find --joblog useful.

I'm a little confused by these. What I'd like is for stdout and stderr
to be combined into a single file for each job, where I can supply the
name of that file per job (something like {}.log or similar). Doable?

Thanks!
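
The two requests in this exchange (one combined stdout+stderr log per job, and an exit status that reflects the command rather than tee) can be sketched in plain bash. This is only an illustration under stated assumptions: run_logged is a hypothetical helper name, not part of GNU Parallel, and the demo command and log path are made up for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GNU Parallel): run a command, copy its
# stdout and stderr to both the terminal and a caller-chosen log file, and
# return the command's own exit status instead of tee's.
run_logged() {
    local log=$1          # log file name chosen by the caller, e.g. "job1.log"
    shift
    "$@" 2>&1 | tee "$log"
    return "${PIPESTATUS[0]}"   # status of the command, not of tee
}

# Demo with a made-up command that writes to both streams and fails with 3.
log_dir=$(mktemp -d)
status=0
run_logged "$log_dir/demo.log" bash -c 'echo out; echo err >&2; exit 3' || status=$?
echo "status: $status"
```

If tee is not needed, quoting the whole command so that the redirection happens inside each job, along the lines of parallel 'cmd1 {} > {}.log 2>&1', should give one combined log per job; treat that as a sketch to verify against your own version rather than a tested recipe.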
David From MAILER-DAEMON Fri Mar 30 04:41:36 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SDXOq-0000Gz-GJ for mharc-parallel@gnu.org; Fri, 30 Mar 2012 04:41:36 -0400 Received: from eggs.gnu.org ([208.118.235.92]:41131) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDXOl-0000GY-Bo for parallel@gnu.org; Fri, 30 Mar 2012 04:41:35 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SDXOc-0002GF-NI for parallel@gnu.org; Fri, 30 Mar 2012 04:41:30 -0400 Received: from mailout.rz.uni-frankfurt.de ([141.2.22.233]:46175 helo=mailout.cluster.uni-frankfurt.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDXOc-0002Fg-Hg for parallel@gnu.org; Fri, 30 Mar 2012 04:41:22 -0400 Received: from smtpauth-ha.cluster.uni-frankfurt.de ([10.1.1.120]) by mailout.cluster.uni-frankfurt.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1SDXOZ-0007xZ-Gk for parallel@gnu.org; Fri, 30 Mar 2012 10:41:19 +0200 Received: from ganymed.kgu.de ([141.2.203.253] helo=[192.168.161.210]) by fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1SDXOX-0002bt-8R for parallel@gnu.org; Fri, 30 Mar 2012 10:41:17 +0200 Message-ID: <4F7571AC.5090604@med.uni-frankfurt.de> Date: Fri, 30 Mar 2012 10:41:16 +0200 From: Thomas Sattler User-Agent: Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20120312 Thunderbird/11.0 MIME-Version: 1.0 To: parallel@gnu.org Subject: Re: BUG: swap_activity broken in v20120322 References: <4F7058AB.70005@med.uni-frankfurt.de> In-Reply-To: <4F7058AB.70005@med.uni-frankfurt.de> X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: 7bit X-MailScanner: Found to be clean X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 141.2.22.233 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , 
List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Mar 2012 08:41:35 -0000 Hi Ole, you did not (yet) comment the BUG, so I've had a closer look at the reported issue: vmstat 1 2 2>/dev/null | tail -n1 | awk '{print $7*$8}' || vm_stat 1 | head -n 3 | tail -n1 | awk '{print $9*$10}' The idea seems to be "run 'vmstat' and run 'vm_stat' in case 'vmstat' failed". But in fact it is only *one* commandline where 'vm_stat' is run in case *awk* failed. The following commands demonstrate this: $ false | tail | true || date # <-- date is not run $ true | tail | false || date # <-- date is run \ \ \__ vmstat \__ awk If somebody would reply with the output of 'vm_stat', I'd offer a patch to fix the issue. (I don't have a MAC.) Thomas From MAILER-DAEMON Fri Mar 30 08:30:48 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SDaye-0003iu-Ic for mharc-parallel@gnu.org; Fri, 30 Mar 2012 08:30:48 -0400 Received: from eggs.gnu.org ([208.118.235.92]:41046) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDayY-0003hZ-Ds for parallel@gnu.org; Fri, 30 Mar 2012 08:30:46 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SDayS-0002uG-1H for parallel@gnu.org; Fri, 30 Mar 2012 08:30:41 -0400 Received: from www5.pairlite.com ([64.130.10.15]:57534 helo=smtp.benizi.com) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDayR-0002u6-QR for parallel@gnu.org; Fri, 30 Mar 2012 08:30:35 -0400 Received: from localhost (localhost [127.0.0.1]) by smtp.benizi.com (Postfix) with ESMTPSA id B51012B1AE; Fri, 30 Mar 2012 08:30:07 -0400 (EDT) Date: Fri, 30 Mar 2012 08:30:07 -0400 (EDT) From: "Benjamin R. 
Haskell" To: Thomas Sattler Subject: Re: BUG: swap_activity broken in v20120322 In-Reply-To: <4F7571AC.5090604@med.uni-frankfurt.de> Message-ID: References: <4F7058AB.70005@med.uni-frankfurt.de> <4F7571AC.5090604@med.uni-frankfurt.de> User-Agent: Alpine 2.01 (LNX 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 64.130.10.15 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Mar 2012 12:30:46 -0000 On Fri, 30 Mar 2012, Thomas Sattler wrote: > Hi Ole, > > you did not (yet) comment the BUG, so I've had a closer look at the > reported issue: > > vmstat 1 2 2>/dev/null | tail -n1 | awk '{print $7*$8}' || > vm_stat 1 | head -n 3 | tail -n1 | awk '{print $9*$10}' Better `awk` usage: vmstat 1 2 2>/dev/null | awk 'END {print $7*$8}' || vm_stat 1 | awk 'NR == 3 {print $9*$10 ; exit}' > The idea seems to be "run 'vmstat' and run 'vm_stat' in case 'vmstat' > failed". But in fact it is only *one* commandline where 'vm_stat' is > run in case *awk* failed. > > The following commands demonstrate this: > > $ false | tail | true || date # <-- date is not run > $ true | tail | false || date # <-- date is run > \ \ > \__ vmstat \__ awk > > If somebody would reply with the output of 'vm_stat', I'd offer a > patch to fix the issue. (I don't have a MAC.) 
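The demonstration quoted above can be captured in a few lines of POSIX shell. A small sketch, with cat standing in for tail and the variable names invented for illustration; it assumes the shell's default behaviour (no pipefail option set):

```shell
#!/bin/sh
# The status of a pipeline is the status of its LAST command, so the
# right-hand side of || runs only when the last stage fails.
first_fails=$(false | cat | true  || echo "fallback ran")
last_fails=$(true  | cat | false || echo "fallback ran")

echo "first stage fails: '${first_fails}'"
echo "last stage fails:  '${last_fails}'"
```

Only the second pipeline triggers the fallback, which is why a failing 'vmstat' at the head of the pipe never causes 'vm_stat' to run.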
osx-server$ vm_stat 1 Mach Virtual Memory Statistics: (page size of 4096 bytes, cache hits 0%) free active spec inactive wire faults copy 0fill reactive pageins pageout 70547 442624 29441 88468 415463 14321M 518664K 2448176K 683760 1460755 367311 70658 442657 29440 88468 415463 139 0 136 0 1 0 70729 442660 29440 88468 415463 145 0 73 0 0 0 70461 442660 29440 88468 415463 388 69 134 0 0 0 70478 442649 29440 88468 415463 16 0 6 0 0 0 [...etc...] Not sure $9*$10 is the right expression to use. "reactive * pageins"? On a very lightly-loaded server, that gets: 998807890080 -- Best, Ben From MAILER-DAEMON Fri Mar 30 08:44:21 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SDbBl-0008Bi-A7 for mharc-parallel@gnu.org; Fri, 30 Mar 2012 08:44:21 -0400 Received: from eggs.gnu.org ([208.118.235.92]:60599) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDbBj-0008BF-4p for parallel@gnu.org; Fri, 30 Mar 2012 08:44:20 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SDbBd-0006ki-0B for parallel@gnu.org; Fri, 30 Mar 2012 08:44:18 -0400 Received: from mailout.rz.uni-frankfurt.de ([141.2.22.233]:34409 helo=mailout.cluster.uni-frankfurt.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDbBc-0006kb-Q1 for parallel@gnu.org; Fri, 30 Mar 2012 08:44:12 -0400 Received: from smtpauth2-ha.cluster.uni-frankfurt.de ([10.1.1.121]) by mailout.cluster.uni-frankfurt.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1SDbBa-0004Wx-9Q; Fri, 30 Mar 2012 14:44:10 +0200 Received: from ganymed.kgu.de ([141.2.203.253] helo=[192.168.161.210]) by fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1SDbBW-0006nZ-EE; Fri, 30 Mar 2012 14:44:06 +0200 Message-ID: <4F75AA95.6070207@med.uni-frankfurt.de> Date: Fri, 30 Mar 2012 14:44:05 +0200 From: Thomas Sattler User-Agent: Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20120312 Thunderbird/11.0 MIME-Version: 1.0 
To: "Benjamin R. Haskell" Subject: Re: BUG: swap_activity broken in v20120322 References: <4F7058AB.70005@med.uni-frankfurt.de> <4F7571AC.5090604@med.uni-frankfurt.de> In-Reply-To: X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-MailScanner: Found to be clean X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 141.2.22.233 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Mar 2012 12:44:20 -0000 >> vmstat 1 2 2>/dev/null | tail -n1 | awk '{print $7*$8}' || >> vm_stat 1 | head -n 3 | tail -n1 | awk '{print $9*$10}' > > Better `awk` usage: > > vmstat 1 2 2>/dev/null | awk 'END {print $7*$8}' || > vm_stat 1 | awk 'NR == 3 {print $9*$10 ; exit}' Still the same problem: Whether 'vm_stat' is run depends on the success of 'awk', not 'vmstat'. (At least under bash.) >> If somebody would reply with the output of 'vm_stat', I'd offer a >> patch to fix the issue. (I don't have a MAC.) > > osx-server$ vm_stat 1 > Mach Virtual Memory Statistics: (page size of 4096 bytes, cache hits 0%) > free active spec inactive wire faults copy 0fill reactive pageins pageout > 70547 442624 29441 88468 415463 14321M 518664K 2448176K 683760 1460755 367311 > 70658 442657 29440 88468 415463 139 0 136 0 1 0 > 70729 442660 29440 88468 415463 145 0 73 0 0 0 > 70461 442660 29440 88468 415463 388 69 134 0 0 0 > 70478 442649 29440 88468 415463 16 0 6 0 0 0 > [...etc...] 
This was the fourth reply with the output of 'vm_stat', so here is some code that should do the job:

{ vmstat 1 2> /dev/null || vm_stat 1; } | awk '
NR!=4{next}
NF==16{print $7*$8}
NF==11{print $10*$11}
{exit}
'

I'm running 'vmstat' (and 'vm_stat' in case 'vmstat' failed) and piping the output to 'awk', which knows where the relevant information is:

vmstat (16 fields per line): field7 * field8
vm_stat (11 fields per line): field10 * field11

By the way: In all four replies I got, the relevant information ("pageins" and "pageout") was in field 10 / 11, never 9 / 10. Also I had to use line number 4 as the output of 'vmstat' and 'vm_stat' both have two lines of headers and have non-zero values in their first data line. (see above)

Thomas

P.S.: Why do we use 'awk' within 'perl'?

From MAILER-DAEMON Fri Mar 30 10:24:58 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SDcl8-0004P1-Ib for mharc-parallel@gnu.org; Fri, 30 Mar 2012 10:24:58 -0400 Received: from eggs.gnu.org ([208.118.235.92]:45430) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDcl1-0004Mc-DI for parallel@gnu.org; Fri, 30 Mar 2012 10:24:57 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SDckv-0006f3-28 for parallel@gnu.org; Fri, 30 Mar 2012 10:24:50 -0400 Received: from www5.pairlite.com ([64.130.10.15]:50715 helo=smtp.benizi.com) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDcku-0006eY-R3 for parallel@gnu.org; Fri, 30 Mar 2012 10:24:44 -0400 Received: from localhost (localhost [127.0.0.1]) by smtp.benizi.com (Postfix) with ESMTPSA id ECC6C2B1AE; Fri, 30 Mar 2012 10:24:16 -0400 (EDT) Date: Fri, 30 Mar 2012 10:24:16 -0400 (EDT) From: "Benjamin R. 
Haskell" To: Thomas Sattler Subject: Re: BUG: swap_activity broken in v20120322 In-Reply-To: <4F75AA95.6070207@med.uni-frankfurt.de> Message-ID: References: <4F7058AB.70005@med.uni-frankfurt.de> <4F7571AC.5090604@med.uni-frankfurt.de> <4F75AA95.6070207@med.uni-frankfurt.de> User-Agent: Alpine 2.01 (LNX 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 64.130.10.15 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Mar 2012 14:24:57 -0000 On Fri, 30 Mar 2012, Thomas Sattler wrote: >>> vmstat 1 2 2>/dev/null | tail -n1 | awk '{print $7*$8}' || >>> vm_stat 1 | head -n 3 | tail -n1 | awk '{print $9*$10}' >> >> Better `awk` usage: >> >> vmstat 1 2 2>/dev/null | awk 'END {print $7*$8}' || >> vm_stat 1 | awk 'NR == 3 {print $9*$10 ; exit}' > > Still the same problem: Whether 'vm_stat' is run depends on the > success of 'awk', not 'vmstat'. (At least under bash.) Right. But at least we're not relying on `head` and `tail`. >>> If somebody would reply with the output of 'vm_stat', I'd offer a >>> patch to fix the issue. (I don't have a MAC.) >> >> osx-server$ vm_stat 1 >> Mach Virtual Memory Statistics: (page size of 4096 bytes, cache hits 0%) >> free active spec inactive wire faults copy 0fill reactive pageins pageout >> 70547 442624 29441 88468 415463 14321M 518664K 2448176K 683760 1460755 367311 >> 70658 442657 29440 88468 415463 139 0 136 0 1 0 >> 70729 442660 29440 88468 415463 145 0 73 0 0 0 >> 70461 442660 29440 88468 415463 388 69 134 0 0 0 >> 70478 442649 29440 88468 415463 16 0 6 0 0 0 >> [...etc...] > > This was the fourth reply with the output of 'vm_stat', (Were any of those replies on-list? 
The Parallel mailing list is the only one I'm on where I feel as if I'm consistently missing messages, despite having a filter to keep them out of spam...) > so here is some code that should do the job: > > { vmstat 1 2> /dev/null || vm_stat 1; } | awk ' > NR!=4{next} > NF==16{print $7*$8} > NF==11{print $10*$11} > {exit} > ' > I'm running 'vmstat' (and 'vm_stat' in case 'vmstat' failed) and pipe > the output to 'awk' which knows where the relevant information is: > > vmstat (16 fields per line): field7 * field8 > vm_stat (11 fields per line): field10 * field11 Looks good to me. > By the way: In all four replys I got, the relevant information > ("pageins" and "pageout") was in field 10 / 11, never 9 / 10. > > Also I had to use line number 4 as the output of 'vmstat' and > 'vm_stat' both have two lines of headers and have non-zero values in > their first data line. (see above) Sounds reasonable. Presumably the first line is the total since boot time, which doesn't provide any useful information about the current state of the system. > Thomas > > P.S.: Why do we use 'awk' within 'perl'? Because it runs on the remote machine, and `awk` has less overhead than `perl`. (Though since it's waiting for at least 1 second to gather stats, it may not be a significant overhead.) 
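The grouped fallback quoted above can be tried on a machine that has neither tool. A rough sketch, assuming nothing beyond POSIX sh and awk: fake_vmstat and fake_vm_stat are hypothetical stand-ins, and the sample rows are copied from the vm_stat output quoted earlier in this thread.

```shell
#!/bin/sh
# Sketch only: the two functions below are stand-ins for the real tools.
fake_vmstat() {
    # Pretend vmstat is missing: print nothing and fail.
    return 127
}

fake_vm_stat() {
    cat <<'EOF'
Mach Virtual Memory Statistics: (page size of 4096 bytes, cache hits 0%)
  free active   spec inactive   wire   faults     copy    0fill reactive  pageins  pageout
 70547 442624  29441    88468 415463   14321M  518664K 2448176K   683760  1460755   367311
 70658 442657  29440    88468 415463      139        0      136        0        1        0
 70729 442660  29440    88468 415463      145        0       73        0        0        0
EOF
}

# The fallback is decided inside the braces; awk only ever sees one stream.
swap_activity=$(
    { fake_vmstat 2>/dev/null || fake_vm_stat; } | awk '
        NR != 4  { next }              # skip two header lines and the first data row
        NF == 16 { print $7 * $8 }     # 16 fields: vmstat layout, per the thread
        NF == 11 { print $10 * $11 }   # 11 fields: vm_stat layout, pageins * pageout
                 { exit }
    '
)
echo "swap activity: $swap_activity"
```

With the sample rows, line 4 has pageins 1 and pageout 0, so the product is 0, which matches the "not both at the same time" idea.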
-- Best, Ben From MAILER-DAEMON Fri Mar 30 10:30:28 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SDcqS-0006H9-QZ for mharc-parallel@gnu.org; Fri, 30 Mar 2012 10:30:28 -0400 Received: from eggs.gnu.org ([208.118.235.92]:58348) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDcqQ-0006G3-6I for parallel@gnu.org; Fri, 30 Mar 2012 10:30:27 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SDcqL-0008Nf-8Z for parallel@gnu.org; Fri, 30 Mar 2012 10:30:25 -0400 Received: from mail-gx0-f169.google.com ([209.85.161.169]:43837) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDcqL-0008MU-2E for parallel@gnu.org; Fri, 30 Mar 2012 10:30:21 -0400 Received: by ggeq1 with SMTP id q1so408956gge.0 for ; Fri, 30 Mar 2012 07:30:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type; bh=yInKlMOejGBc13PR+qxSCuAOPCmIgellmd4WO5ozheM=; b=QBZZJo1/ZhGStduu41snWot2o4Z/ZoN+qwvdEjrP4/+ofBFEk0sYMxBOnk6UBSpeDN 2CuqTD7gxSywm32wDK19gn5/ABAJ7q/J8ykurY5UW8bAgaLgXFUP0iNCQU84TNbkXtbe 0ccOzYjOMicuOsE6ihP0OBuPeBAqZWURrZRBA53pZlsom1sGjWlKiV1jbv9fucUlRYb2 y4Ct308E0GwfiWK8mqmQNVCFVplUSbR4oB62AhXOKDEqecocO90iSbEcgSqflsgCspS4 ezGO1bUaCda5zt+1VZODcljqnGJl1uo9CKNfTYCDcp7nvLE9JgUP+GdkF98AzjAuLTeX rh8A== Received: by 10.68.201.73 with SMTP id jy9mr9999255pbc.35.1333117819110; Fri, 30 Mar 2012 07:30:19 -0700 (PDT) MIME-Version: 1.0 Sender: ole.tange@gmail.com Received: by 10.143.163.11 with HTTP; Fri, 30 Mar 2012 07:29:57 -0700 (PDT) In-Reply-To: References: <4F7058AB.70005@med.uni-frankfurt.de> <4F7571AC.5090604@med.uni-frankfurt.de> From: Ole Tange Date: Fri, 30 Mar 2012 16:29:57 +0200 X-Google-Sender-Auth: isPnHirHcNkoKnQkK6GWgryDSdM Message-ID: Subject: Re: BUG: swap_activity broken in v20120322 To: "Benjamin R. 
Haskell" Content-Type: text/plain; charset=ISO-8859-1 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 209.85.161.169 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Mar 2012 14:30:27 -0000 On Fri, Mar 30, 2012 at 2:30 PM, Benjamin R. Haskell wrote: > Not sure $9*$10 is the right expression to use. "reactive * pageins"? On a > very lightly-loaded server, that gets: 998807890080 The wanted expression is pageouts * pageins. The idea is that pageouts are OK (server pages out unused pages but not because of memory limitation) and pageins are OK (server coming back from a memory limited situation), but not both at the same time. /Ole From MAILER-DAEMON Fri Mar 30 10:46:42 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SDd6A-0002Rd-JD for mharc-parallel@gnu.org; Fri, 30 Mar 2012 10:46:42 -0400 Received: from eggs.gnu.org ([208.118.235.92]:50677) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDd63-0002Qs-E7 for parallel@gnu.org; Fri, 30 Mar 2012 10:46:41 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SDd5y-0003ay-M7 for parallel@gnu.org; Fri, 30 Mar 2012 10:46:34 -0400 Received: from mailout.rz.uni-frankfurt.de ([141.2.22.233]:40549 helo=mailout.cluster.uni-frankfurt.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDd5y-0003ao-Ff for parallel@gnu.org; Fri, 30 Mar 2012 10:46:30 -0400 Received: from smtpauth2-ha.cluster.uni-frankfurt.de ([10.1.1.121]) by mailout.cluster.uni-frankfurt.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1SDd5v-00022w-UT; Fri, 30 Mar 2012 16:46:27 +0200 Received: from ganymed.kgu.de ([141.2.203.253] helo=[192.168.161.210]) by fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) 
(envelope-from ) id 1SDd5t-00087N-Dg; Fri, 30 Mar 2012 16:46:25 +0200 Message-ID: <4F75C740.3000806@med.uni-frankfurt.de> Date: Fri, 30 Mar 2012 16:46:24 +0200 From: Thomas Sattler User-Agent: Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20120312 Thunderbird/11.0 MIME-Version: 1.0 To: "Benjamin R. Haskell" Subject: Re: BUG: swap_activity broken in v20120322 References: <4F7058AB.70005@med.uni-frankfurt.de> <4F7571AC.5090604@med.uni-frankfurt.de> <4F75AA95.6070207@med.uni-frankfurt.de> In-Reply-To: X-Enigmail-Version: 1.4 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-MailScanner: Found to be clean X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 141.2.22.233 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Mar 2012 14:46:41 -0000 >> This was the fourth reply with the output of 'vm_stat', > > (Were any of those replies on-list?) No, only yours came via the list. >> P.S.: Why do we use 'awk' within 'perl'? > > Because it runs on the remote machine, and `awk` has less > overhead than `perl`. (Though since it's waiting for at > least 1 second to gather stats, it may not be a > significant overhead.) Perl is not my preferred language so I might be wrong with the following, but I thought that sub "swap_activity" used 'qx' to execute "$swap_activity" on the local machine. Reading the code again I see something I ignored last time: 'qx' spawns a background process to get (and save!) this information. At first I thought it would read it back into 'parallel'.
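The pattern described here, a background process that saves a value for a later read, might look roughly like this in shell. It is only an illustration and not the code in 'parallel'; cache_file, slow_measure and read_cached are hypothetical names.

```shell
#!/bin/sh
# Sketch only, not GNU Parallel's actual code.
cache_file=$(mktemp)

slow_measure() {
    # Stand-in for a slow probe such as vmstat.
    sleep 1
    echo 42
}

read_cached() {
    # Pick up whatever the previous background run saved (empty the first time).
    cached_value=$(cat "$cache_file")
    # Refresh in the background; write to a temp name and rename, so a later
    # reader never sees a half-written value and the caller never waits.
    ( slow_measure > "$cache_file.new" && mv "$cache_file.new" "$cache_file" ) &
}

read_cached; first=$cached_value     # returns at once, nothing measured yet
wait                                 # only so this demo is deterministic
read_cached; second=$cached_value    # sees the value saved by the first run
wait

echo "first read: '${first}', second read: '${second}'"
```

The caller always works with a slightly stale value, which is the price for never blocking on the measurement.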
Thomas From MAILER-DAEMON Fri Mar 30 11:10:40 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SDdTM-0000sh-24 for mharc-parallel@gnu.org; Fri, 30 Mar 2012 11:10:40 -0400 Received: from eggs.gnu.org ([208.118.235.92]:54196) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDdTG-0000sa-M7 for parallel@gnu.org; Fri, 30 Mar 2012 11:10:39 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SDdT8-0000aA-52 for parallel@gnu.org; Fri, 30 Mar 2012 11:10:34 -0400 Received: from mailout.rz.uni-frankfurt.de ([141.2.22.233]:41548 helo=mailout.cluster.uni-frankfurt.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDdT7-0000a1-Uk for parallel@gnu.org; Fri, 30 Mar 2012 11:10:26 -0400 Received: from smtpauth2-ha.cluster.uni-frankfurt.de ([10.1.1.121]) by mailout.cluster.uni-frankfurt.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1SDdT5-00038E-9F; Fri, 30 Mar 2012 17:10:23 +0200 Received: from ganymed.kgu.de ([141.2.203.253] helo=[192.168.161.210]) by fuenfundzwanzig with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.71) (envelope-from ) id 1SDdT3-0001bc-9H; Fri, 30 Mar 2012 17:10:21 +0200 Message-ID: <4F75CCDC.3080601@med.uni-frankfurt.de> Date: Fri, 30 Mar 2012 17:10:20 +0200 From: Thomas Sattler User-Agent: Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20120312 Thunderbird/11.0 MIME-Version: 1.0 To: Ole Tange Subject: Re: parallel stops working for no obvious reason References: <4F703B4A.1000509@med.uni-frankfurt.de> In-Reply-To: X-Enigmail-Version: 1.4 Content-Type: multipart/mixed; boundary="------------000306090503010708090708" X-MailScanner: Found to be clean X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 141.2.22.233 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Fri, 30 Mar 2012 15:10:39 -0000 This is a multi-part message in MIME format. --------------000306090503010708090708 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit >> This happend last week with v20120222, I'll give it another >> try with v20120322 today. _creating_ these files seems to work with v20120322. (I created nearly 4,400,000 files by now.) But now deleting them is still an issue. Luckily it's much easier to trigger this way. :-) > As you probably can imagine that is hard to reproduce. See if > you can make smaller example fail - preferably something that > can run on smaller machines. I wrote a small script that shows the problem. It completes in less than 10 seconds on my desktop (two cores), but hangs (read: "does not complete within hours") on two other machines (8/32 cores). The problem seems to be within load handling on multicore machines. (All machines run successfully when removing "--load 100%".) Here's a sample output: --- 8< --------------------------------------------------------------- $ time pissue PARALLEL=--load 100% cores on dragon: 2 16:26:50 up 9 days, 39 min, 2 users, load average: 0.15, 0.11, 0.14 4 files: creating (0s) deleting (0s) 16:26:50 up 9 days, 39 min, 2 users, load average: 0.15, 0.11, 0.14 9 files: creating (0s) deleting (0s) 16:26:50 up 9 days, 39 min, 2 users, load average: 0.15, 0.11, 0.14 16 files: creating (1s) deleting (0s) 16:26:51 up 9 days, 39 min, 2 users, load average: 0.15, 0.11, 0.14 25 files: creating (0s) deleting (1s) 16:26:52 up 9 days, 39 min, 2 users, load average: 0.15, 0.11, 0.14 36 files: creating (0s) deleting (0s) 16:26:52 up 9 days, 39 min, 2 users, load average: 0.15, 0.11, 0.14 49 files: creating (1s) deleting (0s) 16:26:53 up 9 days, 39 min, 2 users, load average: 0.15, 0.11, 0.14 64 files: creating (0s) deleting (1s) 16:26:54 up 9 days, 39 min, 2 users, load average: 0.15, 0.11, 0.14 81 files: creating (0s) deleting (1s) 16:26:55 up 9 days, 39 
min, 2 users, load average: 0.30, 0.14, 0.15 100 files: creating (0s) deleting (1s) real 0m6.223s user 0m2.873s sys 0m0.813s --------------------------------------------------------------- >8 --- Thomas
--------------000306090503010708090708
Content-Type: text/plain; charset=UTF-8; name="pissue" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="pissue"

#!/bin/bash

# use a tmp-dir in a RAM disk
mkdir -p /dev/shm/pissue || exit
cd /dev/shm/pissue || exit

# this seems to be important
export PARALLEL="--load 100%"
echo PARALLEL=$PARALLEL

echo -n "cores on $HOSTNAME: "
parallel --number-of-cores
echo

for i in $(seq 2 10); do
    i2=$[i*i]; uptime
    echo -n "$i2 files: creating "
    SECONDS=0; seq $i2 | parallel -X touch
    echo -n "(${SECONDS}s) deleting "
    SECONDS=0; ls | parallel -X rm
    echo -e "(${SECONDS}s)\n"
done

rmdir /dev/shm/pissue
--------------000306090503010708090708--
From MAILER-DAEMON Fri Mar 30 11:44:21 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SDdzx-0007Bi-N1 for mharc-parallel@gnu.org; Fri, 30 Mar 2012 11:44:21 -0400 Received: from eggs.gnu.org ([208.118.235.92]:38507) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDdzq-0007An-1P for parallel@gnu.org; Fri, 30 Mar 2012 11:44:19 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SDdzj-000807-IU for parallel@gnu.org; Fri, 30 Mar 2012 11:44:13 -0400 Received: from mail-yw0-f41.google.com ([209.85.213.41]:38415) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDdzj-0007zk-Cj for parallel@gnu.org; Fri, 30 Mar 2012 11:44:07 -0400 Received: by yhr47 with SMTP id 47so499217yhr.0 for ; Fri, 30 Mar 2012 08:44:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type :content-transfer-encoding; bh=6PNEE0W4ab6lkkAvPIhZtyCAT++g7Fa/69GzvRXWN2A=; 
b=fxEYrMbP19kUDul2ABNThaDX5mWTbvivBlix47XXRTxLCdP60UDjCayGMG8cDOn94O 30rFVw7/2KLHL5OPUj78QrW50IaDn5AnLaeD89uXxUhWGZVcGDDETUuNBrIoeoIFqLM0 uikiqMfeaikL4Nc0sYdV7ijuJWC5Y31nf1mXw6depPZ3eMh8Up2DC1OJHZH2GeoOogq+ lAwwfjAyxI7u2t1I5zeCG1yHGXLLDNI8gJTV+H0gUgw1a5BqEYQnP1Sb/IR461tpxFkE qQ6x7kTk46jr+IhIK6sIIZjyZ46JeHwrK2Dufn6m6TMxuRFWo3WPs3oK2yRjd6oxLFZI 6cRw== Received: by 10.68.201.73 with SMTP id jy9mr10543413pbc.35.1333122245310; Fri, 30 Mar 2012 08:44:05 -0700 (PDT) MIME-Version: 1.0 Sender: ole.tange@gmail.com Received: by 10.143.163.11 with HTTP; Fri, 30 Mar 2012 08:43:45 -0700 (PDT) In-Reply-To: <4F75C740.3000806@med.uni-frankfurt.de> References: <4F7058AB.70005@med.uni-frankfurt.de> <4F7571AC.5090604@med.uni-frankfurt.de> <4F75AA95.6070207@med.uni-frankfurt.de> <4F75C740.3000806@med.uni-frankfurt.de> From: Ole Tange Date: Fri, 30 Mar 2012 17:43:45 +0200 X-Google-Sender-Auth: BeDHnnMiJ7FPfrEeOfCYBsk-A0k Message-ID: Subject: Re: BUG: swap_activity broken in v20120322 To: Thomas Sattler Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 209.85.213.41 Cc: parallel@gnu.org X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Mar 2012 15:44:19 -0000 On Fri, Mar 30, 2012 at 4:46 PM, Thomas Sattler wrote: >>> P.S.: Why do we use 'awk' within 'perl'? >> >> Because it runs on the remote machine, and `awk` has less >> overhead than `perl`. (Though since it's waiting for at >> least 1 second to gather stats, it may not be a >> significant overhead.) > > Perl is not my prefered language so I might be wrong with the > following, but I thought that sub "swap_activity" used 'qx' > to execute "$swap_activity" on the local machine.
> > Reading the code again I see something I ignored last time: > 'qx' spawns a background process to get (and save!) these > information. At first I thought it would read them back > into 'parallel'. The reason for this is that Parallel should not wait for the load to be computed, if it for some reason takes a lot of time. So instead it will spawn off a process to do this and read the output back next time. /Ole From MAILER-DAEMON Fri Mar 30 12:13:38 2012 Received: from list by lists.gnu.org with archive (Exim 4.71) id 1SDeSI-00024D-2V for mharc-parallel@gnu.org; Fri, 30 Mar 2012 12:13:38 -0400 Received: from eggs.gnu.org ([208.118.235.92]:44761) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDeSA-00022d-Ek for parallel@gnu.org; Fri, 30 Mar 2012 12:13:36 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SDeS5-0006qn-Fq for parallel@gnu.org; Fri, 30 Mar 2012 12:13:30 -0400 Received: from mail-yx0-f169.google.com ([209.85.213.169]:64745) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SDeS5-0006qa-5E; Fri, 30 Mar 2012 12:13:25 -0400 Received: by yenm8 with SMTP id m8so528561yen.0 for ; Fri, 30 Mar 2012 09:13:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:content-type :content-transfer-encoding; bh=B/uP9g+B34GfJDZPtoGeW1L0DwGHir2HZZ61RnRuTZk=; b=fnJLEy3lPwNwTKtqDduvaozvCM1rb4jXIuqOMyyOQXhKQPJWqGlLmLap1vO2y/YxYI ddZHUXEpsO4lOvfeHjPK1hxLnZ71wHX6P6HY2AU0PxWF7yQwCwOf4k60P+EE7yNfzHbd mOwUc7RRzDmeh5+EPwOeIbOKe3JkweeICYrAcO9tCES4LfWkrJ/nO0BFGXk0kfvMdoCB SNgZQI7n64dKevULku9SDPIcsq+/KnJRIhbJOKHsBDzn+5Mf3TE/2goVk27eMlIT3J5j fQ4nID9qiHZuNP28cmQhc03mXLuseyiJDv3rKM0L9VF26CAiWPzDja2G4bflNHo9jSmn EYJQ== MIME-Version: 1.0 Received: by 10.60.4.199 with SMTP id m7mr3544355oem.65.1333124002739; Fri, 30 Mar 2012 09:13:22 -0700 (PDT) Sender: owen.solberg@gmail.com 
Received: by 10.182.44.37 with HTTP; Fri, 30 Mar 2012 09:13:22 -0700 (PDT) In-Reply-To: References: Date: Fri, 30 Mar 2012 09:13:22 -0700 X-Google-Sender-Auth: tFeNrw9G0UqwXHaWPyEYQ4oxmoY Message-ID: Subject: Re: unexpected behavior when using GNU parallel with block and recstart to break up fasta file From: juncus@gmail.com To: Ole Tange , parallel@gnu.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 209.85.213.169 X-BeenThere: parallel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: GNU Parallel Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Mar 2012 16:13:36 -0000 Hi Ole, thanks for the reply. Not quite. True, I am observing the same thing (empty files 12 through 20 below), but what is bothering me is file #11, which has 13 bytes, and could have easily fit into file #10 (1092 bytes) and still been well below the 1200 threshold. Another way to have asked this question might have been: Will parallel always assume the last record is partial, if you only provide recstart? Because for some file types, it might not be feasible to provide a recend (like FASTA files, where all you can rely on is the ">" which marks the start of the header for each record.) So in these situations will parallel always kick the last single record into its own solitary process? 
-rw-rw-r-- 1 staff staff 1092 Mar 29 16:25 partialpseudofasta_10.txt
-rw-rw-r-- 1 staff staff   13 Mar 29 16:25 partialpseudofasta_11.txt
-rw-rw-r-- 1 staff staff    0 Mar 29 16:25 partialpseudofasta_12.txt
-rw-rw-r-- 1 staff staff    0 Mar 29 16:25 partialpseudofasta_13.txt
-rw-rw-r-- 1 staff staff    0 Mar 29 16:25 partialpseudofasta_14.txt
-rw-rw-r-- 1 staff staff    0 Mar 29 16:25 partialpseudofasta_15.txt
-rw-rw-r-- 1 staff staff    0 Mar 29 16:25 partialpseudofasta_16.txt
-rw-rw-r-- 1 staff staff    0 Mar 29 16:25 partialpseudofasta_17.txt
-rw-rw-r-- 1 staff staff    0 Mar 29 16:25 partialpseudofasta_18.txt
-rw-rw-r-- 1 staff staff    0 Mar 29 16:25 partialpseudofasta_19.txt
-rw-rw-r-- 1 staff staff 1188 Mar 29 16:25 partialpseudofasta_1.txt
-rw-rw-r-- 1 staff staff    0 Mar 29 16:25 partialpseudofasta_20.txt
-rw-rw-r-- 1 staff staff 1200 Mar 29 16:25 partialpseudofasta_2.txt
-rw-rw-r-- 1 staff staff 1200 Mar 29 16:25 partialpseudofasta_3.txt
-rw-rw-r-- 1 staff staff 1200 Mar 29 16:25 partialpseudofasta_4.txt
-rw-rw-r-- 1 staff staff 1200 Mar 29 16:25 partialpseudofasta_5.txt
-rw-rw-r-- 1 staff staff 1200 Mar 29 16:25 partialpseudofasta_6.txt
-rw-rw-r-- 1 staff staff 1200 Mar 29 16:25 partialpseudofasta_7.txt
-rw-rw-r-- 1 staff staff 1200 Mar 29 16:25 partialpseudofasta_8.txt
-rw-rw-r-- 1 staff staff 1200 Mar 29 16:25 partialpseudofasta_9.txt

Thanks
Owen

On Thu, Mar 29, 2012 at 5:40 PM, Ole Tange wrote: > On Fri, Mar 30, 2012 at 1:54 AM, wrote: >> Hello, >> >> I don't need to say how great GNU parallel is (GREAT!). > > Good to hear. > >> But for the >> first time, I have encountered a behavior I didn't expect from it. I >> am trying to break up a big input FASTA file (DNA sequence) using the >> --block and --recstart options. But it always seems to create ONE >> more file than I really want. I mean, if I have specified 10 jobs (-j >> 10), and if the block size on the 10th job is still below my >> specification (--block 1200), why does it make an 11th file?
This >> means that 10 jobs in parallel run, and then 1 MORE job has to run to >> get the last record. > > It sounds like: https://savannah.gnu.org/bugs/?34241 > > /Ole