octave-maintainers

Re: Patching octave-parallel so it allows more than one master* on a computer


From: Michael Creel
Subject: Re: Patching octave-parallel so it allows more than one master* on a computer
Date: Thu, 10 Mar 2011 16:26:59 +0100

On Thu, Mar 10, 2011 at 4:00 PM, Oz Nahum Tiram <address@hidden> wrote:
> Hi Everyone,
>
> I am not at all a C++ developer, so this is a call for help regarding
> GNU Octave: in Octave Forge there are a few packages for parallel
> processing:
>
> http://octave.sourceforge.net/multicore/index.html
> http://octave.sourceforge.net/parallel/index.html
> http://octave.sourceforge.net/general/function/parcellfun.html
> http://octave.sourceforge.net/general/function/pararrayfun.html
> http://octave.sourceforge.net/openmpi_ext/index.html
>
> The first package allows processing on multiple cores, which is great if
> you have an expensive machine with more than 2 cores ... The second one
> lets you use Octave on the many cores of a Beowulf cluster, but only
> allows one process per node. I.e., if your cluster's compute nodes have
> more than one CPU, you can't use them all - UNLESS you write code using
> functions from 1, 2 or 3. The last solution is not at all user
> friendly, and I can't use it.
>
> The problem with writing code that uses 1, 2 or 3 is that you have to
> invent wrapper functions that can run on an unknown number of CPUs and
> compute nodes if you want your code to be portable ...
>
> What I would like to do is enable more than one compute server on each
> node, i.e., more than one Octave process on each node using
> octave-parallel. At the moment this is not possible, because when the
> function server.m starts it looks for a pid lock file. Commands are
> sent to slave machines over TCP connections to port 12502, and data is
> sent between machines over TCP connections to port 12501.
>
> The following lines in the file pserver.cc in the octave-parallel
> package are responsible for the lock:
>
>  DEFUN_DLD (pserver,,,
>  "pserver\n\
>  \n\
>   Connect hosts and return sockets.")
>  {
>  FILE *pidfile=0;
>  int ppid,len=118;
>  char hostname[120],pidname[128],errname[128],bakname[128];
>  struct stat fstat;
>
>  gethostname(hostname,len);
>  sprintf(pidname,"/tmp/.octave-%s.pid",hostname);
>  if(stat(pidname,&fstat)==0){
>    std::cerr << "octave : "<<hostname<<": server already running"<<std::endl;
>    clean_up_and_exit (1);
>  }
>
> So, after the long introduction, come my questions: 1. Is it possible
> to remove this lock check? 2. Will it be possible to make the master
> communicate with more than one slave on each host? 3. And finally, is
> there someone willing to help me implement it (I will help with testing
> and documentation ... unfortunately, I can hardly write C++)?
>
> I am not subscribed to octave-maintainers ... so please include my
> private email address if you decide to answer me.
>
> many thanks,
> Oz Nahum
>
> *In the subject of this mail, by "master" I of course mean a compute
> slave instance (called by master.m, which is highly confusing) ...
>
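
On the question of wrappers around parcellfun/pararrayfun: a wrapper
doesn't have to be elaborate. Below is a rough sketch of the kind of
thing one might write (it assumes the general package is loaded; the
ncores argument is left explicit, since detecting the core count
portably is exactly the part that varies from setup to setup, and pmap
is just an illustrative name, not an existing function):

  ## rough sketch: map f over a cell array, in parallel when ncores > 1
  function y = pmap (f, xcell, ncores)
    if (nargin < 3 || ncores < 2)
      ## serial fallback when no parallel resources are known
      y = cellfun (f, xcell, "UniformOutput", false);
    else
      ## parcellfun comes from the general package
      y = parcellfun (ncores, f, xcell, "UniformOutput", false);
    endif
  endfunction

  ## usage: squares = pmap (@(x) x.^2, num2cell (1:10), 4);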

Regarding openmpi_ext, I find that it's fairly easy to use if you have
had experience with MPI. It is a nice solution in that you can mix
numerous heterogeneous machines and use all cores on all machines in a
simple way, maintaining code portability. The pelicanhpc Linux distro
has it installed, with working examples (kernel_example.m, for
instance).

Here's the example running on 32 cores over 8 machines:

address@hidden:~$ mpirun -np 33 --hostfile /home/user/tmp/bhosts octave -q
--eval "kernel_example(20000, true, false)"
Just received fits for points 1 though 625 out of 20000
Just received fits for points 626 though 1250 out of 20000
Just received fits for points 1251 though 1875 out of 20000
Just received fits for points 1876 though 2500 out of 20000
Just received fits for points 2501 though 3125 out of 20000
Just received fits for points 3126 though 3750 out of 20000
Just received fits for points 3751 though 4375 out of 20000
Just received fits for points 4376 though 5000 out of 20000
Just received fits for points 5001 though 5625 out of 20000
Just received fits for points 5626 though 6250 out of 20000
Just received fits for points 6251 though 6875 out of 20000
Just received fits for points 6876 though 7500 out of 20000
Just received fits for points 7501 though 8125 out of 20000
Just received fits for points 8126 though 8750 out of 20000
Just received fits for points 8751 though 9375 out of 20000
Just received fits for points 9376 though 10000 out of 20000
Just received fits for points 10001 though 10625 out of 20000
Just received fits for points 10626 though 11250 out of 20000
Just received fits for points 11251 though 11875 out of 20000
Just received fits for points 11876 though 12500 out of 20000
Just received fits for points 12501 though 13125 out of 20000
Just received fits for points 13126 though 13750 out of 20000
Just received fits for points 13751 though 14375 out of 20000
Just received fits for points 14376 though 15000 out of 20000
Just received fits for points 15001 though 15625 out of 20000
Just received fits for points 15626 though 16250 out of 20000
Just received fits for points 16251 though 16875 out of 20000
Just received fits for points 16876 though 17500 out of 20000
Just received fits for points 17501 though 18125 out of 20000
Just received fits for points 18126 though 18750 out of 20000
Just received fits for points 18751 though 19375 out of 20000
Just received fits for points 19376 though 20000 out of 20000
Just received fits for points 1 though 625 out of 20000
Just received fits for points 626 though 1250 out of 20000
Just received fits for points 1251 though 1875 out of 20000
Just received fits for points 1876 though 2500 out of 20000
Just received fits for points 2501 though 3125 out of 20000
Just received fits for points 3126 though 3750 out of 20000
Just received fits for points 3751 though 4375 out of 20000
Just received fits for points 4376 though 5000 out of 20000
Just received fits for points 5001 though 5625 out of 20000
Just received fits for points 5626 though 6250 out of 20000
Just received fits for points 6251 though 6875 out of 20000
Just received fits for points 6876 though 7500 out of 20000
Just received fits for points 7501 though 8125 out of 20000
Just received fits for points 8126 though 8750 out of 20000
Just received fits for points 8751 though 9375 out of 20000
Just received fits for points 9376 though 10000 out of 20000
Just received fits for points 10001 though 10625 out of 20000
Just received fits for points 10626 though 11250 out of 20000
Just received fits for points 11251 though 11875 out of 20000
Just received fits for points 11876 though 12500 out of 20000
Just received fits for points 12501 though 13125 out of 20000
Just received fits for points 13126 though 13750 out of 20000
Just received fits for points 13751 though 14375 out of 20000
Just received fits for points 14376 though 15000 out of 20000
Just received fits for points 15001 though 15625 out of 20000
Just received fits for points 15626 though 16250 out of 20000
Just received fits for points 16251 though 16875 out of 20000
Just received fits for points 16876 though 17500 out of 20000
Just received fits for points 17501 though 18125 out of 20000
Just received fits for points 18126 though 18750 out of 20000
Just received fits for points 18751 though 19375 out of 20000
Just received fits for points 19376 though 20000 out of 20000
Just received fits for points 1 though 12 out of 400
Just received fits for points 13 though 24 out of 400
Just received fits for points 25 though 36 out of 400
Just received fits for points 37 though 48 out of 400
Just received fits for points 49 though 60 out of 400
Just received fits for points 61 though 72 out of 400
Just received fits for points 73 though 84 out of 400
Just received fits for points 85 though 96 out of 400
Just received fits for points 97 though 108 out of 400
Just received fits for points 109 though 120 out of 400
Just received fits for points 121 though 132 out of 400
Just received fits for points 133 though 144 out of 400
Just received fits for points 145 though 156 out of 400
Just received fits for points 157 though 168 out of 400
Just received fits for points 169 though 180 out of 400
Just received fits for points 181 though 192 out of 400
Just received fits for points 193 though 204 out of 400
Just received fits for points 205 though 216 out of 400
Just received fits for points 217 though 228 out of 400
Just received fits for points 229 though 240 out of 400
Just received fits for points 241 though 252 out of 400
Just received fits for points 253 though 264 out of 400
Just received fits for points 265 though 276 out of 400
Just received fits for points 277 though 288 out of 400
Just received fits for points 289 though 300 out of 400
Just received fits for points 301 though 312 out of 400
Just received fits for points 313 though 324 out of 400
Just received fits for points 325 though 336 out of 400
Just received fits for points 337 though 348 out of 400
Just received fits for points 349 though 360 out of 400
Just received fits for points 361 though 372 out of 400
Just received fits for points 373 though 384 out of 400
Just received fits for points 385 though 396 out of 400
Just received fits for points 397 though 400 out of 400
time for kernel regression example using 20000 data points on 33 nodes: 7.683309
time for kernel density example using 20000 data points on 33 nodes: 7.023251
time for bivariate kernel density example using 20000 data points on 33 nodes: 0.111617
address@hidden:~$

My point is that pelicanhpc provides the MPI environment ready to use,
or it provides a model to use if you want to set up your own cluster.
Examples like kernel_example.m show how to use openmpi_ext. I
certainly think that working on the other methods is a good idea, but
I want to point out that it's not too hard to use openmpi_ext.
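
To give an idea of the basic structure, here's a stripped-down sketch
of an openmpi_ext program. It's written from memory, so treat the exact
call signatures (in particular MPI_Comm_Load and MPI_Recv) as
approximate and check them against the package's hello-world examples:

  ## minimal sketch; launch with something like
  ##   mpirun -np 4 --hostfile bhosts octave -q --eval "myscript"
  MPI_Init ();
  CW = MPI_Comm_Load ("NEWORLD");    # get the communicator (name as I recall it)
  rank = MPI_Comm_rank (CW);         # which process am I?
  nodes = MPI_Comm_size (CW);        # how many processes in total?

  ## split the work by rank: each process takes every nodes-th point
  mypoints = (rank + 1):nodes:1000;
  myresult = sum (mypoints .^ 2);    # stand-in for the real computation

  TAG = 99;
  if (rank != 0)
    MPI_Send (myresult, 0, TAG, CW); # workers send their piece to rank 0
  else
    total = myresult;
    for r = 1:(nodes - 1)
      ## as I recall, MPI_Recv returns the received value first
      total += MPI_Recv (r, TAG, CW);
    endfor
    printf ("total: %g\n", total);
  endif
  MPI_Finalize ();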

Cheers,
Michael

