I'm running into an abort while attempting a graceful shutdown and would appreciate some guidance on what to look for.
Environment:
Debian 9 (stretch), libmicrohttpd 0.9.62
Daemon options:
MHD_USE_DUAL_STACK
MHD_USE_IPv6
MHD_USE_ITC
MHD_ALLOW_SUSPEND_RESUME
MHD_USE_EPOLL_INTERNAL_THREAD
What's happening:
Upon receiving SIGINT, our server quiesces the daemon and calls shutdown(2) on the listen socket so that no new connections can be made. We then block on a condition variable and a request count until either all requests are complete or a timeout (5 minutes) expires, and only then stop the daemon and close the socket.
In some cases (I'm still trying to isolate when), the call to MHD_quiesce_daemon leads to an abort. The external logger callback reports this line: "Fatal error in GNU libmicrohttpd daemon.c:4740: Failed to remove listen FD from epoll set".
I'm sure we are not calling quiesce multiple times: the call to MHD_quiesce_daemon is guarded by a mutex-protected flag that tracks whether it has already been called, and I've reviewed the shutdown call stack. As always, I could be wrong.
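The guard described above amounts to something like the following sketch. All names are hypothetical; in the real server the callback would be a thin wrapper around MHD_quiesce_daemon, and count_call here is only a stand-in so the example is self-contained.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical names: a run-at-most-once guard, not libmicrohttpd API. */
static pthread_mutex_t quiesce_lock = PTHREAD_MUTEX_INITIALIZER;
static bool            quiesce_called = false;

/* Runs do_quiesce(ctx) at most once across all threads.
 * Returns true if this call performed the quiesce, false if it had
 * already been done. The callback runs under the lock, so a caller
 * that gets false knows the earlier quiesce has finished. */
static bool quiesce_once(void (*do_quiesce)(void *ctx), void *ctx)
{
    bool run;

    pthread_mutex_lock(&quiesce_lock);
    run = !quiesce_called;
    if (run) {
        quiesce_called = true;
        do_quiesce(ctx);
    }
    pthread_mutex_unlock(&quiesce_lock);
    return run;
}

/* Stand-in callback for illustration: counts how often it is invoked. */
static void count_call(void *ctx)
{
    ++*(int *)ctx;
}
```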
Are there other conditions under which the listen socket is removed from the epoll set? We suspend and resume requests constantly; could a failed resume, or multiple suspend calls on the same connection, cause this behavior?
Any guidance on what to look for would be appreciated. I will keep trying to reproduce this in a more controlled environment with gdb.
Thanks for your time!
Damon