emacs-devel

Re: emacs-25 b6b47AF: Properly encode/decode base64Binary data in SOAP


From: Thomas Fitzsimmons
Subject: Re: emacs-25 b6b47AF: Properly encode/decode base64Binary data in SOAP
Date: Sun, 13 Mar 2016 15:54:34 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1.50 (gnu/linux)

Eli Zaretskii <address@hidden> writes:

>> From: Thomas Fitzsimmons <address@hidden>
>> Cc: address@hidden,  address@hidden
>> Date: Sun, 13 Mar 2016 13:57:32 -0400
>> 
>>    (defun soap-parse-server-response ()
>>      "Error-check and parse the XML contents of the current buffer."
>>      (let ((mime-part (mm-dissect-buffer t t)))
>>        (unless mime-part
>>          (error "Failed to decode response from server"))
>>        (unless (equal (car (mm-handle-type mime-part)) "text/xml")
>>          (error "Server response is not an XML document"))
>>        (with-temp-buffer
>>          (mm-insert-part mime-part)
>>          (prog1
>>              (car (xml-parse-region (point-min) (point-max)))
>>            (kill-buffer)
>>            (mm-destroy-part mime-part)))))
>> 
>> mm-insert-part does:
>> 
>>    (string-to-multibyte (mm-get-part handle no-cache))
>
> Why does it do that?  string-to-multibyte is one of those functions
> that should never be used.

I don't know.  This is the first time I've looked at the mm code.  I'll
have to do more investigation here, apparently.

>> In cases where the caller is expecting an xsd:string, the idea is for
>> soap-client to return a native Emacs string, for the caller's
>> convenience.
>
> But that's not what string-to-multibyte does.
>
>> I guess soap-client assumes that the mm and xml packages will do the
>> right thing to convert XML string values into Emacs's internal
>> format.
>
> I'm not sure we are not mis-communicating: conversion into internal
> format is what decoding does.  Whereas you just said a few messages
> upthread that you thought strings should be returned undecoded,
> i.e. as binary streams of bytes.  What am I missing?

The discussion expanded from being about xsd:base64Binary, to being
about all strings returned by soap-client (see below).  Upthread I was
saying only that xsd:base64Binary values should be returned undecoded.
I wasn't commenting on how other XSD string values (xsd:string, etc.)
should be returned.

>> >> Is the attached patch OK for master and emacs-25?
>> >
>> > Doesn't it bring back the bug which caused Andreas to make the change
>> > you want to undo?
>> 
>> It brings back the behavior of soap-client returning base64-decoded
>> xsd:base64Binary values as unibyte strings.
>
> I'm confused: you've just demonstrated that it returns them as
> multibyte strings with raw bytes in their multibyte encoding.
>
>> The debate on this thread is about whether that behavior is buggy or
>> not.  But yes, I want to revert Andreas's change on both master and
>> emacs-25 branches, because I don't consider the old behavior buggy.
>
> That'll bring the bug in the debbugs package back, I think.  Once
> again, if you want to return undecoded strings, they should at the
> very least be unibyte, not multibyte.  Apologies if I'm too confused
> to talk intelligently about this.

Apologies for contributing to the confusion; it's good to have you
reviewing soap-client's design.

The discussion expanded from being about how to handle xsd:base64Binary
values only (Andreas's patch), to how soap-client handles all strings
(including xsd:string, etc.).  It could be that how soap-client handles
all strings is broken, since it appears to rely on string-to-multibyte,
which you're saying should never be used.  However, soap-client's
decoding has been good enough that no one has complained about string
handling in general up until now.  But I'll review the design with Alex
to see if we can avoid calling string-to-multibyte via mm.
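
To make the string-to-multibyte concern concrete, here is a minimal
sketch of how it differs from actual decoding.  The names
demo-raw-bytes and demo-compare-conversions are hypothetical and exist
only for this illustration; they are not part of soap-client or mm.
The sample input is the two UTF-8 bytes of "é".

```elisp
;; Hypothetical sample: the two UTF-8 bytes of "é" as a unibyte string.
(defvar demo-raw-bytes (unibyte-string #xC3 #xA9))

(defun demo-compare-conversions (bytes)
  "Compare `string-to-multibyte' with UTF-8 decoding of unibyte BYTES.
Return a plist describing both results."
  (let ((widened (string-to-multibyte bytes))          ; keeps raw bytes
        (decoded (decode-coding-string bytes 'utf-8))) ; real characters
    (list :widened-length (length widened)
          :decoded-length (length decoded)
          :widened-equals-decoded (string= widened decoded))))
```

Evaluating (demo-compare-conversions demo-raw-bytes) should show that
the string-to-multibyte result still has two elements (the raw bytes,
now in their multibyte representation), while the decoded result is a
single character, and that the two strings are not equal.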

Maybe I can give an example with XML fragments returned by the server,
to show how I think soap-client should handle xsd:base64Binary values.

The debbugs server will respond with:

<?xml version="1.0" encoding="UTF-8"?>
[...]
<severity xsi:type="xsd:string">normal</severity>
[...]
<originator xsi:type="xsd:base64Binary">Q2zDqW1lbnQgUGl0LS1DbGF1ZGVsIDxjbGVtZW50LnBpdGNsYXVkZWxAbGl2ZS5jb20+</originator>
[...]

soap-client will parse those results into a structure that it returns to
the caller:

([...]
 (severity . "<string1>")
 [...]
 (originator . "<string2>")
 [...])

I think <string2> should be unibyte, because xsd:base64Binary represents
binary data, not necessarily a string.  It was unibyte before Andreas's
patch.  His patch changed it to be multibyte, by assuming the binary
data is a UTF-8 string and decoding it into Emacs's internal format.
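
As a rough sketch of the two behaviors (not the actual soap-client
code), the following uses a short prefix of the base64 value above.
The function names demo-decode-base64-binary and
demo-bytes-as-utf-8-text are hypothetical.  The first returns what I
think soap-client should hand back; the second is the extra step a
caller would take only if it knows the bytes are UTF-8 text.

```elisp
(defun demo-decode-base64-binary (value)
  "Decode VALUE, the text of an xsd:base64Binary element, to raw bytes.
The result is a unibyte string; no character decoding is attempted."
  (base64-decode-string value))

(defun demo-bytes-as-utf-8-text (bytes)
  "Interpret unibyte BYTES as UTF-8 and return an Emacs string.
Only the caller knows whether this interpretation is valid."
  (decode-coding-string bytes 'utf-8))

(defun demo-describe (string)
  "Return (MULTIBYTE-P LENGTH) for STRING."
  (list (multibyte-string-p string) (length string)))
```

With the sample "Q2zDqW1lbnQ=", (demo-describe
(demo-decode-base64-binary "Q2zDqW1lbnQ=")) gives (nil 8), a unibyte
string of eight bytes, while applying demo-bytes-as-utf-8-text to those
bytes first gives (t 7), a multibyte string of seven characters.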

What <string1> should be (unibyte or multibyte) and how it should be
produced (decoded) is the broader discussion.  I don't know enough to
have an opinion on that yet, other than it seems to have been working to
treat it as multibyte up until now.  Again, I'll have to talk to Alex
about this.

Thomas