Re: mach_msg blocking on call to vm_map


From: Richard Braun
Subject: Re: mach_msg blocking on call to vm_map
Date: Fri, 2 Sep 2016 00:38:24 +0200
User-agent: Mutt/1.5.23 (2014-03-12)

On Thu, Sep 01, 2016 at 11:54:20AM -1000, Brent W. Baccala wrote:
> On Thu, Sep 1, 2016 at 10:28 AM, Richard Braun <rbraun@sceen.net> wrote:
> > I completely disagree.
> 
> Thank you, Richard.  Really!  Thank you for disagreeing.  Now we can have a
> good discussion about this!

This is how it works. People have opinions, express them with no
ambiguity (including the level of disagreement), and hopefully the
discussion and the results benefit from that.

> > Most modern microkernels use synchronous IPC
> > that just block, and many operations on top of them do block. The
> > overall system on the other hand must be careful to avoid such
> > deadlocks.
> 
> OK, I read the Mach documentation for mach_msg() and concluded that it was
> like a POSIX read(), that I could operate it in a mode where the kernel
> absolutely would not block my process, and would return EWOULDBLOCK
> instead.  That's basically a kernel guarantee, at least as much as it is.
> (Notice that it doesn't guarantee how long the system call will take - 1
> ms?  1 s?  1 week? - because it's not a real time system, which is why I
> say "as much as it is")

Yes, you can think of mach_msg as such a system call. Note that if
the timeout is 0, it will return immediately instead of blocking,
which is what a real-time system would do too. Whether a call blocks
isn't what real-time systems are about at all.

> Are you now saying that's not how it works on Mach/Hurd?  If so, please let
> me know, because I've been under a big misunderstanding that I need to get
> cleared up!

I think your mistake here is using MACH_SEND_TIMEOUT instead of
MACH_RCV_TIMEOUT. Your message certainly was sent, so there is no
reason to get a timeout error for that.
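
To make the send/receive distinction concrete, here is a rough POSIX
analogy, not Mach code. A socketpair stands in for the port pair, and
toy_rpc and its status codes are names made up for this sketch. The
send half completes as soon as the request is queued, so only the
receive half can time out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical status codes, invented for this sketch. */
enum rpc_status { RPC_REPLIED, RPC_SEND_FAILED, RPC_RCV_TIMED_OUT };

/* Toy RPC: queue a one-byte request on fd, then wait at most
   rcv_timeout_ms for a one-byte reply.  Errors on the receive side
   are folded into the timeout case for brevity.  */
enum rpc_status toy_rpc(int fd, int rcv_timeout_ms)
{
    char request = 'q', reply;
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    assert(rcv_timeout_ms >= 0);

    /* Send half: done as soon as the request is queued, so a send
       timeout has nothing to report here.  */
    if (write(fd, &request, 1) != 1)
        return RPC_SEND_FAILED;

    /* Receive half: the only place a silent server can make us
       wait, hence the only place where the timeout matters.  */
    if (poll(&pfd, 1, rcv_timeout_ms) <= 0)
        return RPC_RCV_TIMED_OUT;
    return read(fd, &reply, 1) == 1 ? RPC_REPLIED : RPC_RCV_TIMED_OUT;
}
```

With a peer that never answers, toy_rpc(fd, 0) returns
RPC_RCV_TIMED_OUT right away even though the request was sent.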

> Can a bunch of screwy translators legitimately cause mach_msg() to block
> for some user space thing that might never happen, even if I've supplied
> MACH_SEND_TIMEOUT?

No, but I personally don't think it would be a problem if they could.

> Yes, but libpager is in user space.  Isn't one of the great selling points
> for Hurd is that we put so much stuff into user space, and the kernel
> offers us guarantees (read: "guarantees") that we're protected from
> misbehaving stuff in user space?

Does the kernel protect your web browser from a server stealing your
data, corrupting it, or slowing you down on purpose? The kernel
is merely a messenger. My position is that, when you contact a server,
you implicitly assign it a level of trust that cannot be measured
by the kernel. The Hurd merely allows the client to detach from the
server, which I don't think is a very useful thing in practice.

So no, the kernel definitely does _not_ protect users from misbehaving
stuff in userspace. On the contrary, since we allow any user to
plug their own servers on the file system, it actually becomes
less secure. This was famously shown with the example of the
firmlink translator used in /tmp, which would cause the removal of
any file targeted by the firmlink on /tmp cleanup during system
startup.

Again, this is personal, and becoming off-topic, but I think the Hurd
should not allow communication with untrusted servers whatsoever,
and we should build a user-friendly mechanism from which the code
could determine whether to trust a server or not.

The current state is that this is indeed still a problem that needs
to be solved. It doesn't necessarily mean that we cannot make it secure.

> > My personal opinion on the matter is that you should only invoke
> > remote objects if you trust them.
> 
> How pervasive is this in the design?  Is vm_map only one of many RPCs that
> can block mach_msg() if some critical system translator is on the blink?

Now that you know you should be using MACH_RCV_TIMEOUT, you should see
that no server can block you indefinitely. But again, that's the only
guarantee you get from the Hurd's application of the "mutually
untrusting" principle, and I don't think it's very useful in practice.
It can even lead to serious mistakes like the original implementation
of select, where the mach_msg timeout was used to implement the select
timeout and, if 0, non-blocking behaviour. Since mach_msg couldn't
possibly know when the server routine had completed a single
non-blocking action, it would always return a timeout error
prematurely. The select timeout simply had to be passed to the
server and implemented there.
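
The select problem can be reproduced with a toy model. This is again
a POSIX analogy rather than Hurd code: start_toy_server, toy_select,
the 20 ms handling cost and the "always ready" answer are all
assumptions made for the sketch. The client passes the select timeout
inside the request and, separately, chooses how long it waits on the
transport.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

enum { TRANSPORT_TIMED_OUT = -1, NOT_READY = 0, READY = 1 };

/* Fork a toy server that answers one request; return the client fd,
   or -1 on failure.  The child is not reaped, for brevity.  */
int start_toy_server(void)
{
    int sv[2], select_timeout_ms;
    pid_t pid;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {
        close(sv[0]);
        if (read(sv[1], &select_timeout_ms, sizeof select_timeout_ms)
            == (ssize_t) sizeof select_timeout_ms) {
            char answer = READY;
            /* Assumed 20 ms cost of handling the request.  */
            poll(NULL, 0, 20);
            /* Toy assumption: the object is already readable, so
               even a zero select timeout yields READY.  A real
               server would wait up to select_timeout_ms here.  */
            send(sv[1], &answer, 1, MSG_NOSIGNAL);
        }
        _exit(0);
    }
    close(sv[1]);
    return sv[0];
}

/* Ask the server whether the object is ready.  select_timeout_ms is
   applied by the server; transport_timeout_ms only bounds how long
   the client waits for the reply message itself.  */
int toy_select(int fd, int select_timeout_ms, int transport_timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    char answer;

    assert(select_timeout_ms >= 0 && transport_timeout_ms >= 0);
    if (write(fd, &select_timeout_ms, sizeof select_timeout_ms)
        != (ssize_t) sizeof select_timeout_ms)
        return TRANSPORT_TIMED_OUT;
    if (poll(&pfd, 1, transport_timeout_ms) <= 0)
        return TRANSPORT_TIMED_OUT;
    if (read(fd, &answer, 1) != 1)
        return TRANSPORT_TIMED_OUT;
    return answer;
}
```

In this model toy_select(fd, 0, 0) times out on the transport before
the server has had a chance to answer, which is the premature timeout
described above, while toy_select(fd, 0, 2000) lets the server apply
the zero timeout itself and report READY.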

> > The original Hurd design,
> > however, was explicitly meant to allow clients and servers to
> > be "mutually untrusting". I'm not exactly sure what this means
> > in practice but it seems that clients can detach themselves from
> > servers at any time. So making the timeout work, and allowing the
> > transaction to be interrupted (usually with a - hardcoded ? check
> > how glibc handles Ctrl-C/SIGINT during an RPC - 3 seconds grace
> > delay before brutal interruption) may be the only things required
> > to make the behaviour comply with "Hurdish expectations".
> >
> Thank you for that clarification.  I've figured out that Ctrl-C is handled
> by a message.  Does glibc spawn a separate thread to handle those
> messages?  Is that why all of the processes on the system have at least two
> threads?  That 3 second timeout - what is it, exactly?  I'll have to look
> at the code, but this is something I've only partially puzzled out.

Yes, that second thread is the "message thread". It implements the
"message port", used by the proc server, ps and maybe other tools to
communicate with a process (see the ps -M option, which allows us to
list processes without blocking when one of them becomes unresponsive
on its message port).

The 3s timeout I'm talking about is the delay between the initial
Ctrl-C (when an RPC is waiting for a reply, an interrupt message is
then sent to the server to request the RPC to be interrupted) and
the brutal disconnection from the server in case nothing happens.
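
The sequence can be sketched as follows. This is a POSIX analogy with
invented names (toy_interrupt_rpc, the result codes), not the glibc
implementation: ask the server to interrupt the operation, give it a
grace period to answer, then drop the connection.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical result codes, invented for this sketch. */
enum intr_result { INTR_SERVER_ANSWERED, INTR_DETACHED };

/* fd carries an RPC whose reply is still pending.  On
   INTR_DETACHED, fd has been closed.  */
enum intr_result toy_interrupt_rpc(int fd, int grace_ms)
{
    char interrupt = 'i', reply;
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    assert(grace_ms >= 0);

    /* Step 1: ask the server to interrupt the pending operation.
       A failed send is ignored; the grace period below then simply
       expires.  */
    (void) send(fd, &interrupt, 1, MSG_NOSIGNAL);

    /* Step 2: give the server grace_ms to answer.  */
    if (poll(&pfd, 1, grace_ms) > 0 && read(fd, &reply, 1) == 1)
        return INTR_SERVER_ANSWERED;

    /* Step 3: nothing happened in time, so detach from the server.  */
    close(fd);
    return INTR_DETACHED;
}
```

Calling toy_interrupt_rpc(fd, 3000) would model the 3 s grace delay
mentioned above.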

See the INTR_INTERFACE in /usr/include/hurd/interrupt.defs.
See _hurd_interrupted_rpc_timeout in hurd/hurdsig.c in the glibc
sources.

-- 
Richard Braun