[libmicrohttpd] race in libmicrohttpd?


From: Eivind Sarto
Subject: [libmicrohttpd] race in libmicrohttpd?
Date: Mon, 20 Jun 2011 19:27:12 -0400

I am using libmicrohttpd for a project serving "Apple HLS live streaming"
content to tablets (iPad, Android, etc.).
I think I am hitting some kind of race in the library when the request load
is high.
I am using MHD_USE_THREAD_PER_CONNECTION and MHD_USE_POLL.

I have written an Apple HLS client simulator and can easily fire up what looks
like thousands of iPads.  In case you are not familiar with the Apple HLS
protocol: each client requests a set of playlists that describe multiple
bitrate streams, which are played as 10-second clips.
Each clip looks like a progressive download.  If the client decides that the
current bitrate is too low or too high, it can jump to a higher or lower
bitrate stream at the next 10-second interval.

Anyway, when I attempt to start 1000 iPads, the library starts logging
errors like:
    Failed to receive data: Socket operation on non-socket
and the clients start to fail.

This only happens when I use sendfile to serve the playlists, but I don't think
it is a problem with sendfile itself.  I suspect sendfile merely changes the
timing of the requests, and that timing decides whether I hit the race.
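
For reference, the message logged above is the usual text for errno ENOTSOCK,
which recv() reports when it is handed a descriptor that is open but does not
refer to a socket.  The following is only a minimal sketch of that behaviour
on a POSIX system; the helper name is hypothetical and nothing here is taken
from libmicrohttpd.  It says nothing about how the library ends up in that
state.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: call recv() on a descriptor that is open but is
   not a socket, and return the errno it produced (0 if recv succeeded). */
int recv_errno_on_plain_file(void)
{
  char buf[16];
  int err = 0;
  int fd = open("/dev/null", O_RDONLY);   /* any non-socket descriptor */

  if (fd < 0)
    return errno;
  if (recv(fd, buf, sizeof buf, 0) < 0)
    err = errno;                          /* ENOTSOCK is expected here */
  close(fd);
  return err;
}
```

A descriptor that has been closed and not reused would be expected to give
EBADF instead, so this error suggests the descriptor number was still (or
again) open as something other than a socket when recv() was called.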

I dug around the source of the library and I found the following code:

void
MHD_connection_close (struct MHD_Connection *connection,
                      enum MHD_RequestTerminationCode termination_code)
{
  SHUTDOWN (connection->socket_fd, SHUT_RDWR);
  CLOSE (connection->socket_fd);
  connection->socket_fd = -1;
  connection->state = MHD_CONNECTION_CLOSED;
  if ( (NULL != connection->daemon->notify_completed) &&
       (MHD_YES == connection->client_aware) )
    connection->daemon->notify_completed (connection->daemon->
                                          notify_completed_cls, connection,
                                          &connection->client_context,
                                          termination_code);
  connection->client_aware = MHD_NO;
}

This smells like a race to me.  The connection thread sets
'connection->socket_fd = -1', at which point the daemon could clean up the
connection.  However, the connection structure is still being referenced by
the connection thread (the notify_completed call and the client_aware update
both come after that assignment).

I moved the first 4 lines of this function to after the notify_completed call.
I still see similar errors, but they are now much more difficult to reproduce.
Before the change, I would see errors at around 1000 iPads.  Now I need 5000
to make it fail.
So maybe there is more than one race?
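
To make the reordering described above concrete, here is a rough,
self-contained sketch.  The struct and function names are hypothetical
stand-ins, not the real MHD types, and the actual shutdown/close of the socket
is left out.  It only shows the ordering: notify first, publish the "closed"
markers last.  There is no locking here, so at best this narrows the window;
it is not a fix.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct MHD_Connection (illustration only). */
struct demo_connection {
  int socket_fd;
  int closed;               /* stands in for MHD_CONNECTION_CLOSED */
  int client_aware;
  void (*notify_completed)(struct demo_connection *c);
  int fd_seen_by_callback;  /* records what the callback observed */
};

/* Hypothetical callback: remember the fd value visible during notify. */
void demo_record_fd(struct demo_connection *c)
{
  c->fd_seen_by_callback = c->socket_fd;
}

void demo_connection_close(struct demo_connection *c)
{
  /* Run the completion callback while the connection still looks live. */
  if (c->notify_completed != NULL && c->client_aware)
    c->notify_completed(c);
  c->client_aware = 0;

  /* Only now set the markers that another thread might be watching.
     The real code would also shut down and close the socket here. */
  c->socket_fd = -1;
  c->closed = 1;
}
```

With this ordering the callback still sees the original descriptor value, and
the last thing the closing thread does to the structure is mark it as closed.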

-eivind  

