l4-hurd

Re: task-server hacking


From: Marcus Brinkmann
Subject: Re: task-server hacking
Date: Thu, 22 May 2003 20:26:13 +0200
User-agent: Mutt/1.5.3i

On Thu, May 22, 2003 at 07:51:07PM +0200, Niels Möller wrote:
> If it's for one subsystem only, then it can get that subsystem from its
> own thread id, and then it should refuse to respond to messages from
> other subsystems.

Right.

> (Bootstrap is still fuzzy, if the bootstrap starts a
> few tasks in the hurd subsystem, and one of them is the task server,
> then the task server needs to know in what range the thread numbers of
> the other tasks lie. Have you thought about that already?)

That's what the boot script parsing is for.  We already do such things for
inserting port names into ext2fs.static etc.

> About thread references, it struck me that perhaps the reference
> relation is reflexive. I.e. thread A refers to thread B iff thread B
> refers to thread A. If that's the case, we halve the size of the
> reference bitmap ;-)

Thread references?  There are no thread references.  If you are talking
about task ID references, then I am not willing to just take random
implementation possibilities and try to refit the design of the IPC system
around that.  Sorry, but if you think that is important you have to do this
work yourself.  I am still not sure that we have indeed covered all possible
nasty situations that can occur when passing normal handles.  If you now
bring in different sorts of half-handles and light-weight non-handles, this
is only bound to increase the problems rather than reduce them.

I guess I made the start by suggesting that task ID references are not real
handles but just bits in the bitmask.  I have now changed my mind, and it
was the comment that death notifications are usually installed and my
average calculation that made me do it.  Plus some nagging doubts about not
using real handles in various corner cases.

> I've started to think about notifications. I think there should be two
> datastructures, one is the refer-to relation, which can be thought of
> as a square n times n bitmap, if n is the maximum number of tasks (it
> might be implemented differently). There is also a list of pending
> notifications, which can also be thought of as a bit matrix (although
> it will be even sparser, so more likely to be implemented as a list)

Ouch, what did I start? :)  I think that the actual implementation is the
very last thing to be really concerned about.

> So, how are notifications sent? There should be a call a task can use
> to get a (randomly selected) death notification. There should also be
> a separate thread in the task server, that can operate as follows:
> 
> Form the list of pending notifications, for which the receiver thread
> wants the server to send an rpc (a task can have a reference without
> having requested explicit notifications). Send an rpc with zero
> timeout to each but one of the receivers. To the final one, send the
> message with a pretty large timeout, say one second or so. Which
> receiver that gets the large timeout should change for each run over
> the list.
> 
> Whenever a task dies, so that entries are added to the pending list,
> the notification thread should be woken up (don't remember, but the
> other thread should interrupt the blocking ipc send).
> 
> I haven't thought much about locking yet (there's currently only a
> single thread, and no notifications), it's too early to say what lock
> granularity is needed, and I think I need some advice to get that right.

There is no hard reason to single out one task that gets a big chance to get
the notification.  Instead, when a task dies, zero-timeout notifications
can be sent out immediately, no other thread needed.
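
To make the idea concrete, here is a minimal sketch in Python (not Hurd
code).  All names (TaskServer, try_send, pending) are hypothetical, and the
zero-timeout IPC send is only modelled by checking whether the receiver is
already waiting.  It shows the death handler itself attempting every
delivery once and recording whatever could not be delivered.

```python
from collections import defaultdict


class TaskServer:
    """Toy model: each death notification is attempted once, never blocking."""

    def __init__(self):
        self.watchers = defaultdict(set)  # task -> tasks holding a reference to it
        self.ready = set()                # tasks currently waiting to receive
        self.delivered = []               # (receiver, dead_task) pairs that got through
        self.pending = []                 # (receiver, dead_task) pairs still undelivered

    def add_reference(self, holder, target):
        self.watchers[target].add(holder)

    def try_send(self, receiver, dead_task):
        # Stand-in for an IPC send with zero timeout (an assumption of this
        # sketch): it succeeds only if the receiver is waiting right now.
        if receiver in self.ready:
            self.ready.discard(receiver)
            self.delivered.append((receiver, dead_task))
            return True
        return False

    def task_died(self, dead_task):
        # Runs in the thread that handles the death; no extra thread needed.
        for receiver in sorted(self.watchers.pop(dead_task, ())):
            if not self.try_send(receiver, dead_task):
                self.pending.append((receiver, dead_task))
```

What to do with the entries left in pending is exactly the open question.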

The problem is what should happen if a task does not get a notification.  To
ensure that it will always get it, it would need one thread per notification
it requests.  But that is absurd.  More than one thread for such
notifications is not reasonable.  So what if many tasks die at the same
time?  Then the client could still be processing the previous notification
when the server attempts to send the next one.

The task server has an interest in getting the notifications to be
processed, because a lost notification means that a task ID is not reused.
So, it might need to retry after a couple of seconds or so.  However, it
should not retry forever, I guess.  In this special situation, I think it is
reasonable for the task server to kill a task if it doesn't respond to death
notification messages for a long time, or if it holds on to task IDs for a
long time (several seconds or even minutes each).  This is reasonable
because usually processing such notifications should only take a few
milliseconds.  The alternative is to drop the messages and let the system
administrator kill such hanging tasks manually.  Maybe it should be
configurable.
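
A rough sketch of such a policy in Python follows.  The names and the
default values (RetryPolicy, retry_interval, give_up_after,
kill_unresponsive) are made up for illustration; the only things taken from
the text above are "retry after a couple of seconds", "don't retry forever",
and "kill or drop, configurable".

```python
from dataclasses import dataclass


@dataclass
class RetryPolicy:
    retry_interval: float = 2.0      # seconds between delivery attempts (assumed)
    give_up_after: float = 60.0      # total seconds before giving up (assumed)
    kill_unresponsive: bool = True   # False: drop and leave it to the admin


def next_action(policy, first_attempt, last_attempt, now):
    """Decide what to do with one undelivered death notification."""
    if now - first_attempt >= policy.give_up_after:
        return "kill" if policy.kill_unresponsive else "drop"
    if now - last_attempt >= policy.retry_interval:
        return "retry"
    return "wait"
```

For example, next_action(RetryPolicy(), 0, 0, 2.5) asks for a retry, while
the same call with now=61 asks for the receiver to be killed.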

When I said I don't know how notifications should work I meant that I am
unsure about what should happen if they don't work.  This is different for
different types of notifications (for example in the console, each
notification has a sequence number, and if notifications are lost, the
client can notice and refresh the screen).
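
The console case can be sketched like this in Python.  This is an
illustration of the sequence-number idea only, not the actual console
client; SequencedReceiver and its fields are hypothetical names.

```python
class SequencedReceiver:
    """Client side: detect lost notifications from gaps in sequence numbers."""

    def __init__(self):
        self.expected = 0    # sequence number we expect next
        self.applied = []    # incremental updates applied in order
        self.refreshes = 0   # number of full refreshes caused by gaps

    def receive(self, seq, update):
        if seq != self.expected:
            # Something was lost: the increments can't be trusted any more,
            # so fall back to a full refresh instead of applying this one.
            self.refreshes += 1
        else:
            self.applied.append(update)
        self.expected = seq + 1
```

Receiving sequence numbers 0, 1, 3 applies the first two updates and
triggers one refresh.  Death notifications have no such cheap recovery,
because the sender, not only the receiver, depends on them being processed.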

Thanks,
Marcus

-- 
`Rhubarb is no Egyptian god.' GNU      http://www.gnu.org    address@hidden
Marcus Brinkmann              The Hurd http://www.gnu.org/software/hurd/
address@hidden
http://www.marcus-brinkmann.de/



