[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: emacs-25 b6b47AF: Properly encode/decode base64Binary data in SOAP

From: Eli Zaretskii
Subject: Re: emacs-25 b6b47AF: Properly encode/decode base64Binary data in SOAP
Date: Sun, 13 Mar 2016 22:19:47 +0200

> From: Thomas Fitzsimmons <address@hidden>
> Cc: address@hidden,  address@hidden
> Date: Sun, 13 Mar 2016 15:54:34 -0400
> Eli Zaretskii <address@hidden> writes:
> >> From: Thomas Fitzsimmons <address@hidden>
> >> Cc: address@hidden,  address@hidden
> >> Date: Sun, 13 Mar 2016 13:57:32 -0400
> >> 
> >>    (defun soap-parse-server-response ()
> >>      "Error-check and parse the XML contents of the current buffer."
> >>      (let ((mime-part (mm-dissect-buffer t t)))
> >>        (unless mime-part
> >>          (error "Failed to decode response from server"))
> >>        (unless (equal (car (mm-handle-type mime-part)) "text/xml")
> >>          (error "Server response is not an XML document"))
> >>        (with-temp-buffer
> >>          (mm-insert-part mime-part)
> >>          (prog1
> >>              (car (xml-parse-region (point-min) (point-max)))
> >>            (kill-buffer)
> >>            (mm-destroy-part mime-part)))))
> >> 
> >> mm-insert-part does:
> >> 
> >>    (string-to-multibyte (mm-get-part handle no-cache))
> >
> > Why does it do that?  string-to-multibyte is one of those functions
> > that should never be used.
> I don't know.  This is the first I've looked at the mm code.  I'll have
> to do more investigation here, apparently.

IME, mm-decode has a lot of baggage from distant past, when it had to
deal with bugs in Emacs, with incompatibilities between Emacs and
XEmacs, and from its own misconceptions in the area of encoding and
decoding text.  Its code should be carefully reviewed for correctness.

> Upthread I was saying only that xsd:base64Binary values should be
> returned undecoded.

But it currently doesn't, AFAIU, since this:

  (cdr (assq 'severity (car (debbugs-get-status 22285))))

is returned as a multibyte string.  So I guess you are saying that
soap-client needs some changes to return xsd:base64Binary data as
unibyte strings?  Or is "severity" an xsd:string?

> I wasn't commenting on how other XSD string values (xsd:string, etc.)
> should be returned.

In general, if it's known to be a text string, and its encoding is
specified by the document, or can be deduced otherwise with a 100%
certainty, I'd recommend to return decoded strings.

> Maybe I can give an example with XML fragments returned by the server,
> to show how I think soap-client should handle xsd:base64Binary values.
> The debbugs server will respond with:
> <?xml version="1.0" encoding="UTF-8"?>
> [...]
> <severity xsi:type="xsd:string">normal</severity>
> [...]
> <originator 
> xsi:type="xsd:base64Binary">Q2zDqW1lbnQgUGl0LS1DbGF1ZGVsIDxjbGVtZW50LnBpdGNsYXVkZWxAbGl2ZS5jb20+</originator>
> [...]
> soap-client will parse those results into a structure that it returns to
> the caller:
> ([...]
>  (severity . "<string1>")
>  [...]
>  (originator . "<string2>")
>  [...])
> I think <string2> should be unibyte, because xsd:base64Binary represents
> binary data, not necessarily a string.

Btw, why is "originator" not a string? why xsd:base64Binary?  It's a
name of a human (or some other entity), so it's clearly text, no?

> What <string1> should be (unibyte or multibyte) and how it should be
> produced (decoded) is the broader discussion.  I don't know enough to
> have an opinion on that yet, other than it seems to have been working to
> treat it as multibyte up until now.  Again, I'll have to talk to Alex
> about this.

If you can reliably decode it, then multibyte and decoded is better.
I'd also say that if it's known that xsd:base64Binary is a string in
disguise, it should also be decoded.  IOW, whenever the data is a
string, it is better to decode it, I agree with Andreas here: unibyte
strings that represent text are a PITA without a good justification.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]