libmicrohttpd
Re: [libmicrohttpd] Missing Feature: Custom HTTP decoding


From: Christian Grothoff
Subject: Re: [libmicrohttpd] Missing Feature: Custom HTTP decoding
Date: Wed, 1 Sep 2010 14:04:37 +0200
User-agent: KMail/1.13.3 (Linux/2.6.32-trunk-vserver-amd64; KDE/4.4.4; x86_64; ; )

Hi!

I've added a new option, MHD_OPTION_UNESCAPE_CALLBACK, in SVN 12790.  Note that
this callback does *not* allow conversion to UCS-2 (in fact, the documentation
specifically says that it should convert to UTF-8 -- code might use strlen on
it!) and requires the output of the unescape function to be no longer than
the input.
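
A minimal sketch of a callback that respects these two constraints (decode in
place, never grow the string, leave a NUL-terminated byte string that strlen
can handle).  The signature below is an assumption about the shape of the
callback, the names my_unescape and hex_value are purely illustrative, and
struct MHD_Connection is only forward-declared so that the sketch compiles
without the library headers:

```c
#include <stddef.h>
#include <string.h>

struct MHD_Connection;          /* opaque here; normally from microhttpd.h */

/* Map one hex digit to its value, or -1 if it is not a hex digit. */
static int
hex_value (char c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* Hypothetical unescape callback (assumed signature): decodes "%HH" in
   place and returns the new length of s.  The write position never
   overtakes the read position, so the result cannot be longer than the
   input. */
size_t
my_unescape (void *cls, struct MHD_Connection *connection, char *s)
{
  char *rpos = s;               /* next byte to read  */
  char *wpos = s;               /* next byte to write */

  (void) cls;
  (void) connection;
  while ('\0' != *rpos)
    {
      if ('%' == *rpos)
        {
          int hi = hex_value (rpos[1]);
          /* Only look at rpos[2] if rpos[1] was a hex digit (not NUL). */
          int lo = (hi >= 0) ? hex_value (rpos[2]) : -1;

          /* "%00" is kept literally so that no NUL byte is embedded. */
          if ( (lo >= 0) && (0 != hi * 16 + lo) )
            {
              *wpos++ = (char) (hi * 16 + lo);
              rpos += 3;
              continue;
            }
        }
      *wpos++ = *rpos++;        /* ordinary byte or malformed escape */
    }
  *wpos = '\0';
  return (size_t) (wpos - s);
}
```

Presumably such a function would be handed to MHD_start_daemon together with
MHD_OPTION_UNESCAPE_CALLBACK and a closure pointer; please check the
documentation in SVN for the exact argument list before relying on this.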

I don't see UCS-2 as a feature that would be required in practice, and in the 
rare cases that might need it, the application can still do the conversion to 
UCS-2 from UTF-8 at a later time easily.  
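
For illustration, a later conversion step in the application could look
roughly like the following sketch.  The function name utf8_to_ucs2 and its
error convention are made up for this example and are not part of
libmicrohttpd; the sketch only covers the Basic Multilingual Plane, since
UCS-2 cannot represent anything beyond it:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert a NUL-terminated UTF-8 string to UCS-2.
   Returns the number of 16-bit units written, or (size_t) -1 on malformed
   input, on a code point outside the BMP, or if out is too small. */
size_t
utf8_to_ucs2 (const char *in, uint16_t *out, size_t out_cap)
{
  const unsigned char *p = (const unsigned char *) in;
  size_t n = 0;

  while (0 != *p)
    {
      uint16_t cp;

      if (n >= out_cap)
        return (size_t) -1;
      if (p[0] < 0x80)
        {                       /* 1 byte: plain ASCII */
          cp = p[0];
          p += 1;
        }
      else if (0xC0 == (p[0] & 0xE0))
        {                       /* 2 bytes: U+0080 .. U+07FF */
          if (0x80 != (p[1] & 0xC0))
            return (size_t) -1;
          cp = (uint16_t) (((p[0] & 0x1F) << 6) | (p[1] & 0x3F));
          if (cp < 0x80)
            return (size_t) -1; /* overlong encoding */
          p += 2;
        }
      else if (0xE0 == (p[0] & 0xF0))
        {                       /* 3 bytes: U+0800 .. U+FFFF */
          if ( (0x80 != (p[1] & 0xC0)) || (0x80 != (p[2] & 0xC0)) )
            return (size_t) -1;
          cp = (uint16_t) (((p[0] & 0x0F) << 12)
                           | ((p[1] & 0x3F) << 6) | (p[2] & 0x3F));
          if ( (cp < 0x800) || ((cp >= 0xD800) && (cp <= 0xDFFF)) )
            return (size_t) -1; /* overlong encoding or surrogate */
          p += 3;
        }
      else
        return (size_t) -1;     /* 4-byte sequence or stray continuation */
      out[n++] = cp;
    }
  return n;
}
```

Each ASCII byte becomes two bytes in the output, which is why this step needs
its own buffer and does not fit the in-place unescape callback.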

Feedback on the new API will be particularly welcome *before* I release 0.9.1 
(which, as usual, may happen anytime).

Happy hacking,

Christian

On Monday 23 August 2010 14:34:55 Gerrit Telkamp wrote:
> >> A good solution might be to support a custom-specific character decoder,
> >> that is called by libmicrohttpd instead of its internal
> >> MHD_http_unescape(). This custom-specific decoder should be provided as
> >> a call-back method by the user. If it is not defined, the internal
> >> MHD_http_unescape() will be used.
> >> 
> >> The decoder should get a pointer to the input buffer, and might return a
> >> new pointer to the output buffer. This would be useful if the decoded
> >> characters in the output buffer need more memory space than what is
> >> available in the input buffer.
> > 
> > I think your argument for a custom decoder makes sense; however, I'm not
> > sure about allowing the custom decoder to return a new pointer.  That
> > would require it to do memory allocation (at least as an option, and if
> > it is optional the interface will be even messier), and then we'd have
> > to handle failures of that (yuck).
> > 
> > This is especially critical given that your points do not seem to justify
> > any need for an unescape function to return a string that is longer than
> > the original input.  If there is such a case, please describe it.
> 
> You are right, when we fix the character encoding to UTF-8, we are sure
> that the size of the decoded buffer will never exceed the size of the
> escaped input buffer. But an application using libmicrohttpd might need
> the HTTP stream encoded as UCS-2 (2 bytes per character).
> 
> Another option might be to allocate the buffer for the decoded stream at
> startup, in case it is needed. So the input buffer for the raw HTTP
> stream (containing escapes) might be 32k, and the output buffer used to
> store the decoded stream ("unescaped") might be set to a user-defined
> size, let's say 64k. If this buffer is more than 0k, it is used for the
> unescaped stream; if it is 0k, the existing buffer is used.
> 
> Best regards
> 
> Gerrit.


