Using parallel over several computers

From: Anders Lind
Subject: Using parallel over several computers
Date: Tue, 14 Mar 2017 10:54:00 +0100

Hi GNU Parallel mailing list.

I'm looking for a way to run a large number of jobs (in parallel) on several computers. The job consists of running an analysis on several thousand input files. The way I have this set up right now is to split the input files into chunks, move them to the various computers, and then run the analysis using GNU parallel on each machine. This has the downside that I have to keep track of which computers are doing what.

I could perhaps set this up using the ssh functionality of parallel, but I would need to be able to stop some machines from running jobs on the fly, since the computers belong to co-workers who sometimes need them for their own work.

My idea was that perhaps I can have a file containing the paths to all the files I want to analyze, which is accessible to all computers on the network. I would then like to set up separate parallel jobs on the various computers that would continuously pull paths from this input file and run the analysis on each one.

Of course the issue here is that I need to be able to keep track of which paths have been used. Having several computers access the file at the same time seems like it would lead to I/O issues.
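
To make the idea concrete, here is a minimal sketch in Python of one way the bookkeeping might work. It is only an illustration under stated assumptions, not a GNU parallel feature: the names claim, pull_paths and claims_dir are hypothetical, and it assumes the shared filesystem honors exclusive file creation (O_CREAT | O_EXCL) atomically across machines, which should be checked for the network filesystem in use. Each worker reads the shared list but only processes a path if it manages to create a marker file for it first. Jobs claimed by a machine that later crashes are not recovered here.

```python
import hashlib
import os


def claim(path, claims_dir, worker):
    """Try to claim `path` for `worker`; return True only if we got it."""
    # One marker file per path; the hash keeps the marker name filesystem-safe.
    marker = os.path.join(claims_dir, hashlib.sha1(path.encode()).hexdigest())
    try:
        # O_EXCL makes creation fail if another worker created the marker first.
        fd = os.open(marker, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(f"{worker}\t{path}\n")  # record who took the job
    return True


def pull_paths(list_file, claims_dir, worker):
    """Yield every path from the shared list that this worker claims."""
    os.makedirs(claims_dir, exist_ok=True)
    with open(list_file) as f:
        for line in f:
            path = line.strip()
            if path and claim(path, claims_dir, worker):
                yield path
```

The list file itself is only ever read, so the machines never write to the same file at once. The marker files double as a record of which machine took which path, and a machine can be taken out of rotation by stopping its worker without affecting the others.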

Am I missing some more obvious solution?

Any help is very much appreciated. And I hope I am not abusing the purpose of this mailing list.
