
Re: [Qemu-devel] [PATCH 24/56] json: Accept overlong \xC0\x80 as U+0000


From: Eric Blake
Subject: Re: [Qemu-devel] [PATCH 24/56] json: Accept overlong \xC0\x80 as U+0000 ("modified UTF-8")
Date: Fri, 10 Aug 2018 10:48:47 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0

On 08/08/2018 07:03 AM, Markus Armbruster wrote:
This is consistent with qobject_to_json().  See commit e2ec3f97680.

Side note: that commit mentions that on output, ASCII DEL (0x7f) is always escaped. RFC 7159 does not require it to be escaped on input, but I wonder if any of your earlier testsuite improvements should specifically cover that both \x7f and \u007f on input are canonicalized to \u007f on round-trip output.
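
As a rough illustration of the output half of that round trip, here is a minimal sketch. The helper escape_ascii_json() is hypothetical and is not the code from commit e2ec3f97680 or qobject_to_json(); it only assumes that control characters and DEL are written as \u00XX escapes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, not QEMU's qobject_to_json(): copy the string `in`
 * to `out`, escaping control characters and DEL (0x7f) as \u00XX and
 * backslash-escaping '"' and '\\'.  Returns the number of bytes written
 * (excluding the NUL terminator), or -1 if `out` is too small.
 */
static int escape_ascii_json(const char *in, char *out, size_t out_size)
{
    size_t used = 0;

    assert(in != NULL && out != NULL);
    if (out_size == 0) {
        return -1;
    }
    out[0] = '\0';

    for (; *in; in++) {
        unsigned char c = (unsigned char)*in;
        int n;

        if (c < 0x20 || c == 0x7f) {
            /* DEL is legal unescaped in JSON, but this sketch escapes it */
            n = snprintf(out + used, out_size - used, "\\u%04x", (unsigned)c);
        } else if (c == '"' || c == '\\') {
            n = snprintf(out + used, out_size - used, "\\%c", c);
        } else {
            n = snprintf(out + used, out_size - used, "%c", c);
        }
        if (n < 0 || (size_t)n >= out_size - used) {
            return -1;      /* output truncated */
        }
        used += (size_t)n;
    }
    return (int)used;
}
```

A parser that accepts both a raw 0x7f byte and the escape \u007f ends up with the same in-memory byte, so a test along these lines would only need to feed both spellings in and check that the escaped form comes out.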


Signed-off-by: Markus Armbruster <address@hidden>
---
  qobject/json-lexer.c  | 2 +-
  qobject/json-parser.c | 2 +-
  tests/check-qjson.c   | 8 +-------
  3 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/qobject/json-lexer.c b/qobject/json-lexer.c
index ca1e0e2c03..36fb665b12 100644
--- a/qobject/json-lexer.c
+++ b/qobject/json-lexer.c
@@ -93,7 +93,7 @@
   *   interpolation = %((l|ll|I64)[du]|[ipsf])
   *
   * Note:
- * - Input must be encoded in UTF-8.
+ * - Input must be encoded in modified UTF-8.

Worth documenting this in the QMP doc as an explicit extension? In general, our QMP interfaces that take binary input do so via base64 encoding rather than via a modified UTF-8 string. I don't know how yajl or jansson would feel about an extension for producing modified UTF-8 for QMP to consume if we really did want to pass NUL bytes without the overhead of UTF-8. What's more, even if you can pass NUL, you still have to worry about all other byte sequences being valid, so base64 is still better for true binary data; it's hard to argue that we'd ever have an interface where we want UTF-8 including embedded NUL rather than true binary. It can also be argued that outputting modified UTF-8 is a violation of JSON, so the fact that we can round-trip NUL doesn't help if the client can't read it.

So having typed all that, I guess the answer is no, we don't want to document it; for now, the fact that we accept \xc0\x80 on input and produce it on output is only for the testsuite, and unlikely to matter to any real client of QMP.
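
For readers unfamiliar with the term, "modified UTF-8" here means that the overlong two-byte sequence \xC0\x80 is tolerated as an encoding of U+0000, while other overlong forms stay invalid. A minimal sketch of that rule follows; decode_mutf8_short() is a hypothetical function, not the code in qobject/json-lexer.c or qobject/json-parser.c, and it deliberately handles only one- and two-byte sequences.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical decoder, not QEMU's implementation: decode one sequence of
 * at most two bytes from `s` (with `len` bytes available).  Stores the
 * code point in *cp and returns the number of bytes consumed, or -1 for
 * anything this sketch does not accept (including 3- and 4-byte forms).
 */
static int decode_mutf8_short(const unsigned char *s, size_t len,
                              unsigned *cp)
{
    assert(s != NULL && cp != NULL);

    if (len >= 1 && s[0] < 0x80) {
        *cp = s[0];                     /* plain ASCII */
        return 1;
    }
    if (len >= 2 && (s[0] & 0xe0) == 0xc0 && (s[1] & 0xc0) == 0x80) {
        unsigned v = ((s[0] & 0x1fu) << 6) | (s[1] & 0x3fu);

        /*
         * Values below 0x80 are overlong and normally rejected; the only
         * exception made here is \xC0\x80, which decodes to U+0000.
         */
        if (v >= 0x80 || (s[0] == 0xc0 && s[1] == 0x80)) {
            *cp = v;
            return 2;
        }
    }
    return -1;
}
```

With this rule, \xC0\x80 decodes to code point 0 without putting a literal NUL byte in the input string, whereas \xC0\x81 (an overlong encoding of U+0001) is still rejected.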

Reviewed-by: Eric Blake <address@hidden>

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


