Re: The problems for the rootless subhurd

From: Da Zheng
Subject: Re: The problems for the rootless subhurd
Date: Tue, 09 Jun 2009 14:25:53 +0800
User-agent: Thunderbird (Macintosh/20090302)

olafBuddenhagen@gmx.net wrote:
> > In order to track all tasks in the subhurd, boot works as a proxy for
> > all RPCs on the task port. However, it seems to be the source of the
> > most serious bug in my modified boot.
> >
> > BUG: After I added the proxy for all RPCs to 'boot', I find that
> > the subhurd sometimes fails to boot. For example, it sometimes stops
> > booting after the system displays "GNU 0.3 (hurd) (console)", and it
> > sometimes boots successfully and displays "login>" but stops working
> > after I try to log in. Sometimes, it even prints the error message
> >
> >    getty[47]: /bin/login: No such file or directory
> >    Use `login USER' to login, or `help' for more information.
> >
> > Of course, sometimes the subhurd can boot and I can log in successfully.
>
> Sounds like some kind of race condition... But I don't know where.
>
> You could try tracing all RPCs made to the proxy (using some logging
> mechanism in the proxy itself, or perhaps rpctrace), and comparing the
> results of various runs...

As I mentioned before, the subhurd sometimes hangs. I think I have found one of the places where it hangs.

boot now proxies all RPCs that are sent to the task port.
The proxy runs in a single thread. For most RPCs it only forwards the request, and the kernel sends the reply back to the subhurd. However, task_create, vm_set_default_memory_manager, processor_set_tasks and host_processor_set_priv are handled by the proxy itself, which sends their replies back directly. One place where the subhurd hangs is a vm_map call made by the exec server: the proxy fails to forward the vm_map request, and mach_msg blocks.
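
As a rough model of that split (this is not the actual boot code), the decision the proxy makes for each request might look like the sketch below. The message IDs are hypothetical placeholders, not the real msgh_id values that MIG assigns, and the names demo_msg, demo_action and demo_classify are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message IDs, for illustration only.  These are NOT the
   real msgh_id values used by Mach.  */
enum demo_msg_id
{
  DEMO_TASK_CREATE = 1,
  DEMO_VM_SET_DEFAULT_MEMORY_MANAGER,
  DEMO_PROCESSOR_SET_TASKS,
  DEMO_HOST_PROCESSOR_SET_PRIV,
  DEMO_VM_MAP
};

/* What the proxy does with a request.  */
enum demo_action
{
  DEMO_REPLY_DIRECTLY,  /* The proxy handles the RPC and replies itself.  */
  DEMO_FORWARD_ONLY     /* The proxy resends the request; the kernel replies.  */
};

/* Stand-in for a request header; only the ID matters here.  */
struct demo_msg
{
  enum demo_msg_id id;
};

/* Decide how the proxy treats MSG: the four RPCs named above are
   answered by the proxy, everything else is only forwarded.  */
enum demo_action
demo_classify (const struct demo_msg *msg)
{
  assert (msg != NULL);

  switch (msg->id)
    {
    case DEMO_TASK_CREATE:
    case DEMO_VM_SET_DEFAULT_MEMORY_MANAGER:
    case DEMO_PROCESSOR_SET_TASKS:
    case DEMO_HOST_PROCESSOR_SET_PRIV:
      return DEMO_REPLY_DIRECTLY;
    default:
      return DEMO_FORWARD_ONLY;
    }
}
```

In this model a vm_map request falls into the DEMO_FORWARD_ONLY case, which is the path where the hang shows up.
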
The forwarding code in boot is as follows:

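  debug ("request %d to %d, real target: %d",
         inp->msgh_id, target, task_pi->task_port);
  /* Resend the message to the tracee.  This is a send-only call, so the
     trailing arguments are assumed to be: no receive port, a timeout
     of 0, and no notify port.  */
  err = mach_msg (inp, MACH_SEND_MSG | MACH_SEND_TIMEOUT, inp->msgh_size, 0,
                  MACH_PORT_NULL, 0, MACH_PORT_NULL);
  /* By default the proxy itself sends no reply.  */
  outp->RetCode = MIG_NO_REPLY;
  if (err)
    {
      info ("mach_msg %d to %d: %s", inp->msgh_id, target, strerror (err));
      debug ("mach_msg %d to %d: %s", inp->msgh_id, target, strerror (err));
      /* Forwarding failed: report the error back to the sender.  */
      outp->RetCode = err;
    }

  debug ("request %d to %d ends", inp->msgh_id, target);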

I have enabled the send timeout, and the time to wait before giving up is 0 (I tried some other values, and they did not seem to work either). I don't understand why mach_msg still blocks even with the send timeout enabled. It is also odd that the subhurd hangs only on vm_map called by the exec server (though I sometimes see the subhurd hang on something else, which is definitely not one of the RPCs forwarded by boot).

I wonder whether it has something to do with memory management, e.g. some memory has been swapped out but cannot be read back from disk. But that should not be possible, because the subhurd has neither its own default memory manager nor its own swap partition.

Does anyone have any idea why mach_msg blocks here?

Thank you,
Zheng Da
