Re: Hurd on a cluster computer

bug-hurd

From:	Richard Braun
Subject:	Re: Hurd on a cluster computer
Date:	Wed, 27 Jul 2016 16:34:06 +0200
User-agent:	Mutt/1.5.23 (2014-03-12)

On Tue, Jul 26, 2016 at 01:42:16PM -1000, Brent W. Baccala wrote:
> Can Hurd work, well, in such an environment?

No, it was not designed for this kind of usage, although anything can
be done with enough time and work.

> First, it's basically Mach that would have to be modified, right?  Changes
> to Hurd servers might be required for performance reasons, but so long as
> Mach works on the cluster, Hurd should work.
> 
> Next, Mach/Hurd's memory limitations and 32-bit pointers.  My first thought
> was to ignore it for right now, since these are well known problems.  If we
> could get Hurd running at all on a cluster computer, then we'd have to
> come back and make sure it can actually use the entire 8 GB of RAM on a
> single node.  Yet I'm not sure.  There might be situations where we have to
> address the entire cluster's RAM, even though accessing a non-local part of
> it will be slow.
> 
> Sending large blocks of data in Mach messages becomes problematic, since we
> can't play shared memory games.  It would have to be emulated, and avoided
> whenever possible.  These are the kinds of changes that would be needed to
> the Hurd servers themselves - they can no longer assume that firing virtual
> memory across a port is fast.
> 
> In-order and guaranteed delivery.  For the moment, let's assume that our
> LAN can do this natively.  Since we're not going through routers, only a
> single Ethernet switch, maybe virtualized, this might work.
> 
> Can a Hurd network driver be built to pass kernel messages, or is this a
> huge problem?  Something like, you load an Ethernet driver, and it has some
> kind of interface that allows Mach messages to be passed through it?
> 
> Protected data, like port rights - let's assume that we use a dedicated
> Ethertype that isn't routed and can't be addressed by anything but trusted
> Mach kernels.  Yes, this means that our Ethernet driver now becomes a
> potential security hole that can be used to steal port rights, but let's
> keep noting and then ignoring stuff like that...
> 
> And, oh yes, a "Mach kernel" is now something that runs across multiple
> processors with no shared memory.  This is the biggest problem that I can
> see - Mach is multithreaded, so that's not a problem, but I'll bet it
> assumes shared memory structures between the threads, and that's pervasive
> in all its code.  Am I right?

Mach was actually designed to build heterogeneous computer clusters,
which is why messages have typed data, and applications communicate
with the kernel(s) through "host" ports (see the host interface in the
GNU Mach reference manual).
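
As a rough illustration of why typed data matters on a heterogeneous
cluster, here is a minimal sketch in Python. It is not the Mach wire
format: the tags TYPE_INT32 and TYPE_BYTES, the 6-byte header layout,
and the functions encode_item/decode_item are all hypothetical. The
point is only that a receiver which knows the type of each item can
convert it from the sender's representation, which it could not do
with an opaque byte blob.

```python
import struct

# Hypothetical type tags for this sketch; not the real Mach type codes.
TYPE_INT32 = 1
TYPE_BYTES = 2

# Hypothetical header: tag (1 byte), element count (4 bytes), byte-order
# flag (1 byte). The header itself is always big-endian.
HEADER = ">BIB"


def encode_item(type_tag, values, byte_order):
    """Pack one typed item using the sender's byte order for the payload."""
    prefix = "<" if byte_order == "little" else ">"
    if type_tag == TYPE_INT32:
        count = len(values)
        payload = struct.pack("%s%di" % (prefix, count), *values)
    elif type_tag == TYPE_BYTES:
        payload = bytes(values)
        count = len(payload)
    else:
        raise ValueError("unknown type tag")
    order_flag = 0 if byte_order == "little" else 1
    return struct.pack(HEADER, type_tag, count, order_flag) + payload


def decode_item(data):
    """Unpack one typed item, converting integers from the sender's order."""
    type_tag, count, order_flag = struct.unpack_from(HEADER, data, 0)
    offset = struct.calcsize(HEADER)
    prefix = "<" if order_flag == 0 else ">"
    if type_tag == TYPE_INT32:
        return list(struct.unpack_from("%s%di" % (prefix, count), data, offset))
    if type_tag == TYPE_BYTES:
        # Untyped bytes are passed through unchanged.
        return data[offset:offset + count]
    raise ValueError("unknown type tag")
```

Two senders with different byte orders produce different payload bytes
for the same integers, yet decode_item returns the same values for
both, because the type information travels with the data.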

There used to be a network proxy to pass data between hosts, copying
when shared memory isn't available, and this proxy would masquerade
remote ports locally.

> If so, then the first step would be to modify Mach, probably throughout its
> code, so that it can handle threads with no shared memory between them,
> only a communication interface provided by a network driver.  That gets it
> running on a cluster, then we need to remove the memory limitations, and
> start tuning things to make it run well.

No, none of this is necessary, and it would break assumptions commonly
made by current POSIX software.

> The payoff is a supercomputer operating system that presents an entire
> cluster as a single POSIX system with hundreds of processors, terabytes of
> RAM, and petabytes of disk space.
> 
> Any thoughts?

Most attempts in the past have failed. It seems better to build
specialized cluster computers on top of local operating systems.
Look for "single system image" on a search engine for projects
with this goal.

Also, look at QNX, which is probably the closest example to what
would be relatively easy to achieve on the Hurd. It doesn't provide
a complete SSI system, but it allows mostly transparent access to
remote resources, as Plan 9 and similar systems do, in a way that is
much closer to how the Hurd on top of Mach would do it.

-- 
Richard Braun
