monotone-devel

Re: [Monotone-devel] Performance improvement splitout


From: Nathaniel Smith
Subject: Re: [Monotone-devel] Performance improvement splitout
Date: Sun, 17 Sep 2006 18:28:23 -0700
User-agent: Mutt/1.5.13 (2006-08-11)

On Fri, Sep 08, 2006 at 11:15:03AM -0700, Eric Anderson wrote:
> Nathaniel Smith writes:
>  > Hmm, I don't get it -- if the socket is readable (and presumably it
>  > is, if we're getting bytes when we call recv), shouldn't select be
>  > returning instantly anyway?  Your description seems to indicate that
>  > it's actually blocking...
> 
> It's not blocking; it's just waiting for an extended period of time.
> Running strace in -T mode to print time spent in syscalls, I
> discovered that even when calling select with a timeout of 1us, it
> would still wait for a few milliseconds.  It didn't happen all of the
> time, and since the call was only checking for error conditions,
> whether or not data was present to be read was irrelevant.
> 
> Stuff like this:
> % strace -e select -T mtn -d /tmp/foo.db pull usi.hpl.hp.com \*     
> select(8, [], [], [7], {0, 1})          = 0 (Timeout) <0.004812>
> select(8, [], [], [7], {0, 1})          = 0 (Timeout) <0.004714>
> select(8, [], [], [7], {0, 1})          = 0 (Timeout) <0.005282>
> ...
> 
> not
> select(8, [7], NULL, NULL, {21600, 0})  = 1 (in [7], left {21600, 0}) <0.000010>
> 
> From what I recall and the patch, it was this part:
>       probe.clear();
>       probe.add(*(sess.str), sess.which_events());
>       Netxx::Probe::result_type res = probe.ready(armed ? instant : timeout);
> 
> in the armed == true state.
> 
> Waiting ~5ms on average was causing CPU utilization to hover around
> 80%, even though the system as a whole was capable of 100%.  As I said
> earlier, for whatever reason, when talking over the loopback device on
> one machine, the same behavior was not observed.

Hmm, so I just read this again, and now I have a hypothesis!

It's well known that requesting a wakeup in 1us isn't going to work;
unless you have some super-fancy realtime patches in your kernel, the
smallest amount of time you can sleep for is 1/HZ in theory, or
generally a few milliseconds in practice.  I also see now that these
slow select calls are timing out; before I thought that they were
reporting that IO was available (since from your description, you said
that the fd's were readable -- I should have paid more attention).  We
see here that select _can_ return very quickly when there is IO
available, and I thought that that was what it was doing.

So my theory is that the kernel sees that we have a non-zero timeout
value (nevermind that it's very small), and so we get put to sleep,
and then woken up again "as soon as possible", which is a few
milliseconds later.  If, instead, we actually passed a _zero_ timeout
value, then we would not yield our time slice at all, and we would see
select returning in 0.00001 seconds instead of 0.005.  (This _might_
also make some sense of the localhost behavior.  IIRC the linux
loopback device is notorious for working differently from other
devices; perhaps it does not put select() callers to sleep for very
small timeouts like this.)

The reason we pass a very-small-but-nonzero timeout to select is to
work around one of the more broken bits of netxx's design -- it uses
the 0 timeout as a sentinel to mean "no timeout at all".  I tried
fixing this once, long ago, but it quickly turned into a horrible
morass.  A pragmatic workaround might be to hack the netxx select code
so that if the Netxx::Timeout passed in is very small, it is truncated
to zero before being passed to select...
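The truncation could look something like the helper below -- a sketch
only, not actual netxx code; the 1000us threshold is an assumption
(anything safely below 1/HZ would do):

```cpp
#include <sys/time.h>

// Hypothetical workaround helper: because netxx treats a zero Timeout
// as "no timeout at all", callers pass {0, 1} to mean "return
// immediately".  Before handing the timeval to select(), squash any
// sub-tick timeout down to a true zero so the kernel polls instead of
// putting the process to sleep for ~1/HZ.
static void truncate_tiny_timeout(struct timeval *tv) {
    const long threshold_usec = 1000;       // assumed; below 1/HZ on
                                            // typical kernels of the era
    if (tv && tv->tv_sec == 0
        && tv->tv_usec > 0 && tv->tv_usec < threshold_usec) {
        tv->tv_usec = 0;
    }
}
```

Calling this just before select() inside netxx's probe code would keep
the {0, 1} sentinel convention at the API boundary while avoiding the
unwanted sleep.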

I also notice, on re-reading, that we are repeatedly calling select
with nothing in the read or write fds, and with a "zero" timeout.  Why
the heck would we be doing that?  It's basically a noop by
definition...

-- Nathaniel

-- 
"But suppose I am not willing to claim that.  For in fact pianos
are heavy, and very few persons can carry a piano all by themselves."



