emacs-devel

Re: [PATCH] Implement fast verisons of json-parse functions


From: Herman, Géza
Subject: Re: [PATCH] Implement fast verisons of json-parse functions
Date: Sat, 30 Mar 2024 19:36:57 +0100


Eli Zaretskii <eliz@gnu.org> writes:

From: Herman, Géza <geza.herman@gmail.com>
3 test failures:
1. Handling of utf-8 decode errors: the new parser emits
json-utf8-decode-error instead of json-parse-error (this is what
the test expects).  I can fix this by modifying the test

OK, but we will need to mention this in NEWS as an incompatible
change.

Yes. I mention this only because there is an alternative: the parser originally emitted json-parse-error here, and that was changed during the review. So if we prefer to maintain compatibility, it's easy to revert this change.
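For callers that don't want to depend on which of the two errors we end up with, one handler can list both condition names. This is only a rough sketch: my/json-parse-or-nil is a made-up name, not something in the patch, and json-utf8-decode-error is assumed to be the symbol the new parser signals (condition-case does not need it to be defined, so the form also loads where only json-parse-error exists).

```elisp
;; Hypothetical helper, not part of the patch.
(defun my/json-parse-or-nil (string)
  "Parse STRING as JSON; return nil if it is malformed.
Handles both `json-parse-error' and `json-utf8-decode-error', so the
caller does not care which one the parser signals for bad UTF-8."
  (condition-case nil
      (json-parse-string string)
    ;; A handler may name several conditions; either one matches.
    ((json-parse-error json-utf8-decode-error) nil)))

(my/json-parse-or-nil "[1, 2]")      ; a valid document parses normally
(my/json-parse-or-nil "[\"\x80\"]")  ; a lone continuation byte is invalid UTF-8
```

Returning nil is unambiguous with the default arguments, because json-parse-string represents JSON null and false as :null and :false.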

2. Handling of a single \0 byte

Does JSON allow null bytes in its strings?  If not, why
wrong-type-argument is not TRT?

That's correct, null bytes are not allowed (anywhere, not just in strings). But my point is that the old parser makes a special distinction here. Null bytes are not the only bytes that JSON forbids: \x01, for example, isn't allowed either. Yet the old parser gives a different error for null bytes than for \x01 bytes, even though the JSON spec forbids \x00 and \x01 in the same way. I don't know why null bytes are handled specially, so I didn't follow this behavior in my parser. Maybe this special error case was added because libjansson couldn't parse strings with null bytes back then (because the API only accepted zero-terminated strings)?

To me, wrong-type-argument means that the input argument to the parser is incorrect, for example an integer where a string is expected. But here the parser gets a string; it's just that the string has null bytes in it somewhere. The type of the argument to json-parse-* is fine, it's the value that has the problem. So in my opinion the parser should signal some kind of json-error, not wrong-type-argument. Of course, if we consider strings-with-null and strings-without-null to be two different types, then the wrong-type-argument error makes sense (though I don't know why we'd want to do this).
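The difference is easy to observe by asking which error symbol each input produces. A small probe, assuming json-parse-string is available in the running Emacs; my/json-signaled-condition is a made-up name used only for illustration:

```elisp
;; Hypothetical probe, for illustration only.
(defun my/json-signaled-condition (string)
  "Return the error symbol signaled when parsing STRING, or nil if none."
  (condition-case err
      (progn (json-parse-string string) nil)
    ;; `err' is (ERROR-SYMBOL . DATA); keep only the symbol.
    (error (car err))))

(list (my/json-signaled-condition "\x00")   ; a single NUL byte
      (my/json-signaled-condition "\x01"))  ; another forbidden control byte
```

As described above, the old parser is expected to report wrong-type-argument for the first input and a JSON error for the second, while the new parser reports a JSON error for both.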

3. Handling objects with duplicate keys.

I think we should modify the expected results of the test to match the
new behavior, and leave the order as it is now.

OK.

But please also compare with what the Lisp implementation does in
these cases, as that could give us further ideas or make us
reconsider.

I checked json-read, and it seems to behave exactly like my parser. I thought that json-read could only produce one format, but it turns out it has the json-object-type and json-array-type variables, so it can produce the same variety of output as the C-based parsers. I think the doc of json-read should mention this. Anyway, the doc says:

(defvar json-object-type 'alist
 "Type to convert JSON objects to.
Must be one of `alist', `plist', or `hash-table'.  Consider let-binding
this around your call to `json-read' instead of `setq'ing it.  Ordering
is maintained for `alist' and `plist', but not for `hash-table'.")

I played with this a little bit, and it works as described (for hash tables, it keeps the last key-value pair).
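Roughly what I tried, as a sketch: my/json-read-as is a made-up helper name, and the input is just an arbitrary object with a duplicated key.

```elisp
(require 'json)

;; Hypothetical helper: read TEXT with `json-read-from-string',
;; let-binding `json-object-type' as the docstring recommends.
(defun my/json-read-as (type text)
  "Read JSON TEXT, converting objects to TYPE."
  (let ((json-object-type type))
    (json-read-from-string text)))

(let ((text "{\"a\": 1, \"b\": 2, \"a\": 3}"))
  (list (my/json-read-as 'alist text)    ; both "a" members kept, in order
        (my/json-read-as 'plist text)    ; likewise
        ;; Hash-table keys are strings by default; the last "a" wins.
        (gethash "a" (my/json-read-as 'hash-table text))))
```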

I think this behavior is important, because json-read is used when pretty-formatting JSON. Pretty formatting shouldn't remove duplicate entries, nor change the ordering of members. Because the new parser also behaves like this, it can be used to speed up pretty formatting as well (well, only the parsing half of it, since no new to-JSON serializer is implemented yet).


