octave-maintainers

Re: Octave Interpreter


From: Stefan Seefeld
Subject: Re: Octave Interpreter
Date: Mon, 06 Oct 2014 23:12:59 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.1.1

On 2014-10-06 22:34, Bipin Mathew wrote:
> Hey Guys,
>
>    I have a few questions about MPI and people's ideas about the scope
> of this project more generally.
>
> 1.) Is it possible, using MPI, to spawn processes on specific
> servers? The tutorials I have seen online appear to indicate that you
> can specify the number of processes to spawn but not precisely where
> they will be spawned (aside from the mpd.hosts file); can this be
> set programmatically?

Technically, MPI itself does not really handle the spawning of processes
(MPI-2's MPI_Comm_spawn is a partial exception, but its support and
semantics vary between implementations). Rather, a separate tool
(typically called 'mpirun') is used to start N instances of a given
application, which then use the MPI API to connect to each other and
process data in parallel. Doing this spawning programmatically from
within a running "frontend" process would be very useful, I believe,
particularly in scripting contexts such as Octave or Python, but that
would have to happen outside the realm of MPI.
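To illustrate the launcher side of this: with Open MPI's mpirun, placement
can in fact be controlled via a hostfile or on the command line. This is a
minimal sketch only; the exact flags differ between MPI implementations,
and `node1`, `node2`, and `./my_mpi_app` are placeholder names.

```shell
# Hostfile listing where ranks may run, with per-host slot counts
cat > hosts.txt <<'EOF'
node1 slots=2
node2 slots=2
EOF

# Launch 4 ranks spread over the hosts listed in the file
mpirun -np 4 --hostfile hosts.txt ./my_mpi_app

# Or name the hosts directly, without a file
mpirun -np 2 --host node1,node2 ./my_mpi_app
```

Either way, the placement decision still lives in the launcher, not in the
application's MPI calls, which is the point above.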

>
> 2.) How is MPI's error handling and failure resilience? If a node
> fails mid-computation what happens? Or even pre-computation, is there
> an online way of knowing which nodes are available, or will MPI just
> not even attempt to launch on servers it perceives as down?

Again, as MPI itself isn't doing the launching, checking node status
upfront is outside its scope. Also, given all the assumptions about
symmetry that are made as part of the SPMD (single program, multiple
data) execution model, I wouldn't characterize MPI as failure
resilient. Individual implementations may provide some means to recover
from errors, but those are entirely outside the scope of MPI itself. In
other words, this is a quality-of-implementation issue, not something
inherent to the MPI standard. For example:
https://www.open-mpi.org/faq/?category=ft

>
> 3.) Are we expecting to support persistence of distributed objects?
> This is important since "Big Data" is seldom ephemeral. People want to
> load their tables / data-cubes once. Of course we should also support
> a mechanism for temporary distributed objects for constructs like
> ifft(fft(X)) and for ad-hoc analysis / prototyping. Therefore I vote
> yes, that we should support persistent distributed objects, but just
> wanted to get other people's views. This also motivates my thought that
> slave processes should launch as close to the data as possible, à la
> Hadoop.

I think at least initially we should consider persistence and
distribution orthogonal. With pMATLAB (as well as OpenVSIP), a
distributed array can be accessed locally in terms of its local
sub-array (with an appropriate mapping between local and global
indices), and each process can then use local I/O for persistence.
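To make the local/global mapping concrete, here is a hypothetical sketch of
the pMATLAB-style model: a 1-D array block-distributed over P processes,
where each process holds only its local sub-array and converts between
global and local indices. The class and method names (`BlockDist`, etc.) are
illustrative, not an existing API.

```python
class BlockDist:
    """Block distribution of n_global elements over n_procs processes."""

    def __init__(self, n_global, n_procs):
        self.n_global = n_global
        self.n_procs = n_procs
        # Ceiling division: every process but possibly the last
        # holds `block` consecutive elements.
        self.block = -(-n_global // n_procs)

    def owner(self, g):
        """Rank that owns global index g."""
        return g // self.block

    def to_local(self, g):
        """(rank, local index) pair for global index g."""
        return g // self.block, g % self.block

    def local_range(self, rank):
        """Half-open global index range [lo, hi) held by `rank`."""
        lo = rank * self.block
        hi = min(lo + self.block, self.n_global)
        return lo, hi


dist = BlockDist(n_global=10, n_procs=3)
print(dist.local_range(0))  # (0, 4)
print(dist.local_range(2))  # (8, 10)
print(dist.to_local(9))     # (2, 1)
```

Each process can then open its own file for the `local_range` it owns, which
is exactly the sense in which persistence stays orthogonal to distribution.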


>
> 4.) What are people's opinions on other transport layer technologies
> like Google's protocol buffers or Thrift?

I don't know those, but I would hope that, if we can manage to define
distributed arrays, we will be able to fully abstract away the
underlying transport, so the specific choice becomes less important.
Users shouldn't have to deal with MPI or similar; they just use regular
access operators, and all required data movement happens behind the
scenes (e.g., using the owner-computes rule).
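An illustrative sketch (not real MPI code) of what "owner-computes behind
regular access operators" means: each process updates only the elements it
owns, and remote accesses are resolved by asking the owner. The `DistArray`
class is a toy invented for this example, with the ranks simulated in one
address space for clarity.

```python
class DistArray:
    """Toy distributed array: element i lives on rank i % n_procs."""

    def __init__(self, data, n_procs):
        self.n_procs = n_procs
        # Per-rank local storage, keyed by global index.
        self.local = [{} for _ in range(n_procs)]
        for i, v in enumerate(data):
            self.local[i % n_procs][i] = v

    def owner(self, i):
        return i % self.n_procs

    def __getitem__(self, i):
        # A real implementation would send a message to the owning
        # rank here; the user only sees a regular access operator.
        return self.local[self.owner(i)][i]

    def __setitem__(self, i, v):
        # Owner-computes: only the owning rank writes element i.
        self.local[self.owner(i)][i] = v


a = DistArray([1, 2, 3, 4, 5, 6], n_procs=3)
a[4] = a[4] * 10      # data movement hidden behind the operators
print(a[4])           # 50
print(a.owner(4))     # 1 (rank 1 holds element 4)
```

The point is that the indexing syntax stays the same whether the element is
local or remote; only the hidden communication differs.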

For that reason, I suggest we pick an existing protocol that requires
minimal work to get off the ground (such as MPI), then focus on the
high-level API and semantics for distributed arrays. Once that's
established, other "backends" could be added if that turns out to be useful.


    Stefan


-- 

      ...ich hab' noch einen Koffer in Berlin...



