Re: bug#23750: 25.0.95; bug in url-retrieve or json.el

From: Philipp Stephani
Date: Tue, 29 Nov 2016 23:09:57 +0000

Eli Zaretskii <address@hidden> wrote on Tue, 29 Nov 2016 at 18:24:

> > From: Dmitry Gutov <address@hidden>
> > Date: Tue, 29 Nov 2016 13:05:39 +0200
> > Cc: address@hidden
> >
> > On 29.11.2016 13:03, Kentaro NAKAZAWA wrote:
> >
> > > (let* ((content (encode-coding-string
> > >                  "ほげ <- VALID utf-8 Japanese multibyte text"
> > >                  'utf-8))
> > >        (url "https://api.github.com/gists")
> > >        (url-request-method "POST")
> > >        (url-request-data
> > >         (json-encode
> > >          `(("description" . "test")
> > >            ("public" . false)
> > >            ("files" . (("test.txt" . (("content" . ,content)))))))))
> > >   (with-current-buffer (url-retrieve-synchronously url)
> > >     (buffer-string)))
> >
> > json-encode returns a multibyte string.
>
> Any idea why?

Because (symbol-name 'false) returns a multibyte string. I guess the ultimate reason is that the reader always creates multibyte strings for symbol names.
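
A minimal sketch for checking which representation a given string has, in case it helps to verify the claim on a particular Emacs build. The helper name `my/string-kind' is hypothetical; `multibyte-string-p' and `encode-coding-string' are the standard primitives.

```elisp
(defun my/string-kind (s)
  "Return the symbol `multibyte' or `unibyte' for string S.
Hypothetical helper, for illustration only."
  (if (multibyte-string-p s) 'multibyte 'unibyte))

;; The symbol name that json-encode ends up using for `false':
(my/string-kind (symbol-name 'false))

;; The explicitly encoded content from the recipe, for comparison:
(my/string-kind (encode-coding-string "ほげ" 'utf-8))
```

Evaluating the two forms side by side shows whether the symbol name and the encoded content use different representations in the running Emacs.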

> Is it again that 'concat' misfeature, when one of the
> strings is pure-ASCII, but happens to be multibyte?

Why is it a misfeature? I'd expect a concatenation of multibyte and unibyte strings to either implicitly upgrade to a multibyte string (as in Python 2) or raise a signal (as in Python 3).
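
The `concat' behavior under discussion can be reproduced in isolation. The following is only a sketch: `my/concat-demo' is a hypothetical name, and `string-to-multibyte' stands in for whatever produced the pure-ASCII multibyte string in the original recipe.

```elisp
(defun my/concat-demo ()
  "Concatenate a multibyte ASCII string with unibyte UTF-8 bytes.
Hypothetical helper; returns a plist describing the result."
  (let* ((bytes (encode-coding-string "ほげ" 'utf-8)) ; unibyte, raw UTF-8 bytes
         (ascii (string-to-multibyte "false"))        ; ASCII only, yet multibyte
         (joined (concat ascii bytes)))
    (list :bytes-multibyte (multibyte-string-p bytes)
          :ascii-multibyte (multibyte-string-p ascii)
          :joined-multibyte (multibyte-string-p joined)
          ;; Character count versus size of the internal representation.
          :chars (length joined)
          :internal-bytes (string-bytes joined))))

(my/concat-demo)
```

The joined string comes out multibyte, and its character count no longer matches its internal byte count, because each encoded byte is carried along as a separate raw-byte character.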

That url-retrieve breaks in this case is unfortunate, but I guess we can't do much about it without breaking other stuff. Maybe the behavior regarding unibyte and multibyte strings (e.g. what kinds of strings the reader and `concat' generate) should simply be documented.
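
On the caller's side, one way to sidestep the mix is to leave the content as ordinary text and encode only the final JSON string. This is a sketch under that assumption, not a change to url.el or json.el; the helper name `my/json-request-body' is hypothetical.

```elisp
(require 'json)

(defun my/json-request-body (payload)
  "Return PAYLOAD serialized as JSON in a unibyte UTF-8 string.
Hypothetical helper: encoding happens once, after `json-encode'."
  (encode-coding-string (json-encode payload) 'utf-8))

(let ((body (my/json-request-body
             '(("description" . "test")
               ("files" . (("test.txt" . (("content" . "ほげ")))))))))
  ;; A unibyte string has as many characters as bytes.
  (list (multibyte-string-p body)
        (= (length body) (string-bytes body))))
```

The result could then be bound to `url-request-data' in place of the bare `json-encode' call from the recipe.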