Re: [libmicrohttpd] Problems with latency and getting "stuck"
From: maurice barnum
Subject: Re: [libmicrohttpd] Problems with latency and getting "stuck"
Date: Mon, 07 Apr 2014 14:05:44 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0
On 04/05/2014 11:21 PM, Christian Grothoff wrote:
On 04/05/14 02:03, maurice barnum wrote:
Hi.
I'm working on a project where I want to use libmicrohttpd to handle
incoming HTTP connections that are then forwarded to a ZeroMQ server.
It would be helpful if you would mention which MHD version you are using.
0.9.34
Three problems I'm encountering:
* latency is bad due to a significant delay between when I wake up
a connection with MHD_resume_connection and when MHD_run calls the
corresponding handler callback
What do you consider a 'significant delay' here? From your data plot, it
seems you're concerned with individual milliseconds or even a few
microseconds, but maybe I missed something.
* each run of MHD_run will accept several incoming connections but
only "retire" one of the connections I've resumed.
"retire" or "resume"? I am pretty sure MHD_run is perfectly happy to
tear down multiple connections during one invocation. And yes, as
MHD_run may accept fresh connections before processing 'resume' events,
this may increase latency, especially given that you're using a single
thread for all processing. If you are concerned with
milli/micro-seconds, I wonder why you are not using a thread pool.
I have separate experiments using a thread pool that makes blocking
calls to my backend vs. queuing the request and suspending the
connection. I may return to that approach when I understand the
issues with my current one.
I was surprised to see that MHD_run(), in my traces, never called my
callback for more than one resumed connection. For example, following
one zmq_poll, I resume 16 connections ("5" in the output is logged right
after MHD_resume_connection), but the next call to MHD_run results in
only a single callback ("6" in the output is logged after calling
MHD_queue_response) for a resumed connection. That is the pattern I've
seen:
1396654686212937 -> ZMQ_POLL -1 20 | | | | | | | | | | | | | | | | | | | |
* 1396654686215761 <- 1 | | | | | | | | | | | | | | | | | | | |
1396654686215766 0x2470f00 | | | | | | | | 5 | | | | | | | | | | |
1396654686215768 0x2470e60 | | | | | | | | | 5 | | | | | | | | | |
1396654686215769 0x2470d20 | | | | 5 | | | | | | | | | | | | | | |
1396654686215769 0x2553280 | | | | | | | | | | | | 5 | | | | | | |
1396654686215770 0x2470b40 | | | | | | | 5 | | | | | | | | | | | |
1396654686215771 0x2553140 | | | | | | | | | | | | | | 5 | | | | |
1396654686215771 0x2470aa0 | 5 | | | | | | | | | | | | | | | | | |
1396654686215772 0x25530a0 | | | | | | | | | | | | | | | 5 | | | |
1396654686215773 0x25531e0 | | | | | | | | | | | | | 5 | | | | | |
1396654686215773 0x2470be0 | | | | | | 5 | | | | | | | | | | | | |
1396654686215774 0x2470a00 | | 5 | | | | | | | | | | | | | | | | |
1396654686215775 0x2470dc0 | | | | | | | | | | 5 | | | | | | | | |
1396654686215776 0x2553320 | | | | | | | | | | | 5 | | | | | | | |
1396654686215776 0x23f5140 5 | | | | | | | | | | | | | | | | | | |
1396654686215777 0x2470c80 | | | | | 5 | | | | | | | | | | | | | |
1396654686215778 0x2470960 | | | 5 | | | | | | | | | | | | | | | |
1396654686215779 -> MHD_RUN 20 | | | | | | | | | | | | | | | | | | | |
1396654686215893 0x25530a0 | | | | | | | | | | | | | | | 6 | | | |
1396654686215961 -> ZMQ_POLL 6 20 | | | | | | | | | | | | | | | | | | | |
1396654686215981 <- 1 | | | | | | | | | | | | | | | | | | | |
1396654686215982 -> MHD_RUN 20 | | | | | | | | | | | | | | | | | | | |
* 1396654686216009 0x25530a0 | | | | | | | | | | | | | | | X | | | |
1396654686216050 0x2553140 | | | | | | | | | | | | | | 6 | | | |
1396654686216060 -> ZMQ_POLL 6 19 | | | | | | | | | | | | | | | | | | |
note: the trace doesn't show the return from MHD_RUN, which is
immediately before each "-> ZMQ_POLL" line. I'll debug this to try and
understand what is happening.
* eventually, everything stops: I call zmq_poll with a timeout of -1
(MHD_get_timeout() returned 0), but the epoll fd never signals readable,
even when new connections come in.
You have set a connection limit of 25. If you suspended 25 connections,
the accept FD (and all 25 suspended connections) will be out of the
epoll set, and your server grinds to a halt until you resume a connection.
That makes sense, but I observe the server grinds to a halt after all
connections have been resumed and deleted (the "X" in the trace is from
the MHD_OPTION_NOTIFY_COMPLETED callback).
My event loop listens on a ZeroMQ socket and the epoll fd returned from
MHD. The loop looks basically like (pseudo-code):
while true:
MHD_run(md)
timeout = MHD_get_timeout() / 1000
if not timeout:
timeout = -1
zmq_poll(items, len(items), timeout)
I hope you're somehow having the MHD epoll socket in the zmq_poll set here.
Yes, using MHD_get_daemon_info().
...
Any ideas on where to look/debug is appreciated. I've attached a
digested trace of my debug output that shows the behavior.
I'd try with a connection limit of 1 first, that should simplify what
happens, and you should encounter certain problems immediately instead
of only after 25 'suspended' connections.
Happy hacking!
That sounds like a good idea. Thanks!