On 01/26/2015 08:57 PM, Tomas Heran wrote:
Hi,
thanks for taking the time look at and comment my proposal. Please see
my responses inline ...
Same here (also cut out parts that don't need a reply).
Christian Grothoff wrote:
Given the specifics of the program I'm using MHD from, I'm doing all
connection management myself and only handing MHD a client connections
once they are "ready to talk" HTTP on (well, I give it one connection
per daemon really).
So, I do roughly this:
1) MHD_start_daemon(... | MHD_USE_NO_LISTEN_SOCKET |
MHD_USE_STREAM_CONNS ..., MHD_OPTION_NOTIFY_COMPLETED ...)
2) MHD_add_stream_connection
3) Endless loop of MHD_run() which I break only if complete handler
gets called with non-zero argument. Maybe there's a better way how
to know when all clients disconnected?
You have a few choices, there is MHD_get_daemon_info with
MHD_DAEMON_INFO_CURRENT_CONNECTIONS which in your case should work; you
could also look at the set of sockets (or just "max") returned by
MHD_get_fdset(), especially as you use MHD without a listen socket.
Without my changing the 'client_aware', the complete handler gets called
with 0 once a request is handled (response is sent), but subsequent
client disconnect (which is correctly detected inside the library when
read(...) == 0) does not result in complete handler getting called at
all.
Does this make a little more sense?
Yes, but seems very expensive and ugly to me.
4) I understand you may not care, but how do you see the
proposed new API interact with HTTPS? What happens if
MHD runs with HTTPS and then your API is used? We have
to at least consider and document those cases.
In our environment/program this is handled at the level of the
connection "object" which is able to encapsulate an SSL/TLS secured
socket as well. But as far as MHD is concerned, it'll simply be using
the supplied the read/write function pointers and the reading/writing
functions, based on the type of the connection (secure or not) would
either use plain read()/write() or SSL_read()/SSL_write().
Ok, but that again shows that this way of doing things is not even
expected to interoperate with other MHD features: you can't use MHD
threading or thread pools or the event loops, you create a fresh MHD per
connection (yuck), and you don't work with TLS as provided by MHD (only
your own).
6) Your patch overall does not give a good answer for how
this should work with the various event loops; I overall
dislike extensions that will only work in special cases
(i.e. only with external select, etc.). I don't see
how you plan to make that work nicely.
Well, I "ducked" all this by introducing MHD_USE_STREAM_CONNS (sucky
name, I know) flag - when the MHD daemon is started using this, it
assumes it all connections are of this type and will only run
MHD_poll_stream_conns().
I assumed one would only ever want to use one type or another inside
one instance of the MHD daemon.
Again, that makes it a rather serious limitation of the entire approach.
8) Why do you change the worker pool size from an
'unsigned int' to an 'unsigned long long'? I think
2^32 threads will still do for quite some time...
Agreed, more than quite some time ;) I believe I made this change based
on a warning generated by lint (on Solaris) ... looking at the code now,
the variable is at one point compared with SIZE_MAX which is 2^64 on
LP64, which was likely the place where the warning was issued. I'll
double check that. But size of the thread pool was never a reason for
me changing this.
We do:
if (daemon->worker_pool_size>= (SIZE_MAX / sizeof (struct MHD_Daemon)))
but this is to avoid a (theoretical) integer overflow issue on 32-bit
systems if the worker-pool-size is very large (say ~ 2^24). Regardless,
the code is correct, and lint is not a tool to be blindly followed.
Most importantly, have you really considered _alternatives_? Like say
just connecting to loopback and feeding the data into the MHD socket --
say from another thread or a co-routine? I think most of what you try to
do can be done at the expense of going one more time though TCP/IP (or
at least a Unix Domain Socket). Still, for most cases that should be
totally sufficient, and would avoid bloating the API _and_ answer the
questions of how this is to work with the various event loops.
Actually, a few months back, I did actually try to implement something
like that - precisely because I wanted to avoid introducing those
"object" connections. I had a prototype which almost worked but never
finished it because I really didn't like the complexity (more threads,
unix domain socket, connecting to "myself" etc.) - my concerns were not
as much performance related as they were about the complexity of the
code.
You don't need more threads, and you don't actually need a unix domain
socket either. You can use "socketpair()" for the bidirectional channel,
and you can use epoll() or whatever event loop mechanism you prefer to
handle your socketpair() processing -- either in one big event loop that
combines MHD and your polling, or using a 2nd thread (or thread pool)
for the socketpairs(). While more costly in system calls to the use of
socketpair(), this will almost certainly be more scalable than your
solution of thread+MHD-daemon per connection.
And compared to the proposed non-orthogonal complex changes to MHD, I
would suspect that if done right, this could be very clean.
I believe that even when I'll address the poll-ing, then introducing
these object encapsulated connections addition will end up being much
simpler (as in more straightforward) solution than the threaded
looping-back socket "pump" that you propose and I gave a shot and
didn't like ;)
That said, I absolutely understand your concerns about introducing this
to the MHD and bloating the API.
That is one concern, but I furthermore also cannot imagine that it would
lead to a good solution even for your project. The number of hacks are
just a bit too high for that ;-).
I also understand that the way I need to use MHD is quite specific -
basically, I'm not using *any* of the connection management in there
(listening for new connections, handling of the client connections
etc.) and in fact, all I need is an HTTP engine (transport agnostic)
and MHD was the closest (I reviewed a few similar libraries - mostly
the ones you're listing on your website) that I could get this
functionality from with just "a few" changes.
Anyway, I hope this is shedding a little more light on what my goals
and intentions are.
Yes, I think I get those now, and based on my current understanding I
would encourage you to give socketpair() -- maybe in combination with a
single thread (pool) handling the non-MHD side of _multiple_
socketpair()s another try.
It will probably take a lot more work and convincing arguments for me to
consider integrating a patch in this direction, as so far my conclusion
is that this would just be wrong. Especially as what seem to me to be
cleaner alternatives are already enabled by the current API.