[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: qlimit

From: Marcus Brinkmann
Subject: Re: qlimit
Date: Mon, 24 Jun 2002 03:31:48 +0200
User-agent: Mutt/1.4i

On Wed, Jun 19, 2002 at 05:34:56PM -0400, Roland McGrath wrote:
> > There are only two ways to reduce the number of RPCs: Group multiple
> > rpcs into one or change the semantics.  The former makes it harder for
> > clients to display changes incrementally, the latter is only of limited
> > use.  I could for example have a normal screen matrix notification also
> > have the meaning: "check the current cursor position", this would half
> > the number of RPCs for the common case.
> The major kind of semantics change you could make would be to use less
> message passing and more shared memory.  This is one of the classic ways to
> reduce exactly this kind of overhead.

Ayup, thanks for pointing this out.  I did a bit of research, and found some
papers on non-blocking pipes using circular arrays in asynchronous
communication, mostly related to real-time systems and parallel programming. 
Although none of the algorithms I found matched our requirements, it was
helpful in getting a bigger picture.

For the console server, we only have a one way communication (one writer
distributes the same data to multiple clients), and we wouldn't trust any
client input anyway, so no cooperative model is appropriate.  This leaves
what you suggested:

> That is, the change notifications
> can be relatively rare (e.g. only one ever queued) and just serve to alert
> you to examine the shared memory area for updates.  If you want to have a
> full record for incremental updates, you can use a shared memory area that
> encodes a log in your own format.  That too is not fully synchronous in the
> sense of the client never losing an update, unless the log expands
> arbitrarily (which you could do).  But you could make the log a large ring
> buffer and easily have more than enough backlog space than you ever care to
> worry about.

This is what I did now (just committed the code).  I used the notify port
feature to not queue any messages anymore when the queue is full, and set
the qlimit to 1 in the client (larger numbers also work, but don't make much
sense).  I introduced a PENDING bit in the modification request structure on
the server side, to make sure that the client always will receive at least
one notification after the server changed something.  I added a ring buffer
of changes to the shared memory area, and a field indicating the number of
records written by the server.  One record is 8 bytes long, and the highest
bit in the first four bytes shows if this is a matrix change (start - end)
or a status change with a bit field indicating what changed (cursor position
etc).  The client has to use a read-and-check algorithm to extract the next
1 (N...) records.  The server is allowed to change only the next record
before updating the count of records written (so if "written" is 5, the
sixth record is unreliable).  All of this should be pretty standard.

The handling of the notification ports is a bit annoying.  I had to create a
class and bucket and donate a service thread to retrieve the kernels
responses.  I decided that one thread is enough, if this turns out to be
incorrect, we can still change it to manage_multithread.  However,
destruction of a display becomes more complicated now, as kernel msg
accepted notifications can reach us after we destroyed the display
structure.  To fix this I use the clean_operation of the notify_class to
free the display structure.  I think I cover all cases now but one, and that
is the following:

( Send
     If the destination port is
     destroyed before the notification is generated, then a send-once
     notification is generated instead.

I am not sure what the destination port is here, if it is the client port
we try to send to, we have a problem.  Because a send-once notification
doesn't let us know which clients port died, so we can not find the correct
filemod_req structure belonging to this request.  I think we either have to
use one port for each client, or maintain separate dead name notifications
for the client ports so we can free the filemod requests.  This is a corner
case, that can only happen if the client is malicious or crashes while we
try to send, so we can fix it at any convenient time.

Now, what did this give us?  The numbers show that the approach is working. 
Instead trying to send 85000 messages, we only send 100.  The client side
works fine, too, and actually has become a bit simpler.  The only parameter
we have to tune at the client side is ohw much "slack" we give it, eg, if it
sees that it lags behind by N records, it can make the decision to do a full
refresh rather than an incremental update.  If N is size of ring buffer,
the client doesn't have a choice, but it is actually nicer if the client
chooses a lower N.  I don't have an algorithm to find a good N dynamically

Performance wise, this solution doesn't seem to be better than the earlier
brute force approach with a send timeout.  Without clients, the server needs
30s to process the find command, with one client it needs 50s, with two
clients it needs 53s.  I don't know why it needs 20s more with one client
attached.  I tried to profile it, but I need to try again with the
libc0.3-prof package installed, because the difference isn't obvious in the
profiling data I gathered profiling just my program.  I don't really see why
sending 100 messages should cost 20 seconds.  With my new model, I would
expect the server to be almost as fast with clients attached as without.  I
would accept a constant penalty per client (eg, the three seconds between
one and two clients), but it seems there is a 17s penalty for having any
client.  I can not explain this yet.


`Rhubarb is no Egyptian god.' Debian address@hidden
Marcus Brinkmann              GNU    address@hidden

reply via email to

[Prev in Thread] Current Thread [Next in Thread]