emacs-devel

Re: I created a faster JSON parser


From: Herman , Géza
Subject: Re: I created a faster JSON parser
Date: Sun, 10 Mar 2024 07:58:12 +0100


Christopher Wellons <wellons@nullprogram.com> writes:

What do you think?

In review I noticed a potential pointer overflow in json_parse_string:

parser->input_current + 4 <= parser->input_end

[...]

In json_make_object_workspace_for and json_byte_workspace_put, a size
is doubled without an overflow check ("new_workspace_size * 2").

Thanks for the review and for finding these problems! I fixed them: https://github.com/geza-herman/emacs/commit/cbbf3dd494034750ff324703e64f1125a1056832.patch
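
For readers following the two review points above, here is a minimal sketch in C of how such checks are commonly written. It is not taken from the linked commit, and the helper names has_bytes_left and double_size_checked are hypothetical. The idea is to compare the remaining length instead of forming a pointer that may lie past the end of the buffer, and to test against SIZE_MAX / 2 before doubling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if at least N bytes remain in [cur, end).
   Subtracting two pointers into the same buffer is well defined, whereas
   computing cur + n beyond one past the end is undefined behavior.  */
bool
has_bytes_left (const unsigned char *cur, const unsigned char *end, size_t n)
{
  assert (cur <= end);
  return (size_t) (end - cur) >= n;
}

/* Hypothetical helper: double *SIZE unless that would overflow size_t.
   Returns false and leaves *SIZE untouched on overflow.  */
bool
double_size_checked (size_t *size)
{
  if (*size > SIZE_MAX / 2)
    return false;
  *size *= 2;
  return true;
}
```

With these, a test such as "parser->input_current + 4 <= parser->input_end" would become has_bytes_left (parser->input_current, parser->input_end, 4), assuming both fields point into the same input buffer.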

But this JSON parser is tightly
coupled with the Emacs Lisp runtime, which greatly complicates
things. I couldn't simply pluck it out by itself and drop it in, say,
AFL++.

Yes, it needs some work. Lisp object creation happens only at a few specific places, so it is easy to remove (actually, I wrote this parser outside of Emacs, and then just put it in by adding the necessary Lisp object creation code). Or, if the fuzzer needs the actual output (I mean, the result of the parsing), it shouldn't be too hard to add some code that provides it. The other thing is error handling, but that can also be easily replaced by using longjmp.

I'm happy to do this work; I'd just need some directions on how to do it. I'm not experienced with fuzz testing, so if you are, I'd be glad if you could give me some advice: which fuzzing framework to use, which introductory material is worth reading, etc.
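
As an illustration of the longjmp-based error handling mentioned above, here is a minimal sketch of a standalone fuzz harness. Everything in it is an assumption made for the example: the toy_* names and the toy grammar (nested empty arrays only) stand in for the real parser, which is not shown here. Only the entry point LLVMFuzzerTestOneInput is real; it is the function that libFuzzer calls for each generated input.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the parser state.  */
struct toy_parser
{
  const uint8_t *cur;
  const uint8_t *end;
  jmp_buf on_error;		/* where parse errors unwind to */
};

/* Stands in for signaling a Lisp error: unwind to the setjmp below.  */
static void
toy_fail (struct toy_parser *p)
{
  longjmp (p->on_error, 1);
}

/* Toy grammar: array = '[' array* ']', nested at most 64 levels deep.  */
static void
toy_parse_array (struct toy_parser *p, int depth)
{
  assert (p->cur <= p->end);
  if (depth > 64 || p->cur == p->end || *p->cur != '[')
    toy_fail (p);
  p->cur++;
  while (p->cur != p->end && *p->cur == '[')
    toy_parse_array (p, depth + 1);
  if (p->cur == p->end || *p->cur != ']')
    toy_fail (p);
  p->cur++;
}

/* Returns 1 if the whole input parsed, 0 on a parse error.  */
int
toy_parse (const uint8_t *data, size_t size)
{
  struct toy_parser p;
  p.cur = data;
  p.end = data + size;
  if (setjmp (p.on_error))
    return 0;			/* reached via longjmp from toy_fail */
  toy_parse_array (&p, 0);
  return p.cur == p.end;	/* reject trailing bytes */
}

/* libFuzzer entry point: feed the bytes to the parser, ignore the result.
   Crashes and sanitizer reports are what the fuzzer looks for.  */
int
LLVMFuzzerTestOneInput (const uint8_t *data, size_t size)
{
  toy_parse (data, size);
  return 0;
}
```

With clang, a file like this can be built as a libFuzzer target using "clang -g -fsanitize=fuzzer,address harness.c" (the file name is arbitrary). As far as I know, AFL++ can also drive harnesses of this shape, but check its documentation for the exact setup.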

As noted earlier, the parser gets its performance edge through
skipping the intermediate steps. This is great! That could still be
accomplished without such tight coupling, allowing for performance
*and* an interface that is testable and fuzzable in relative
isolation.

Yes, I think a SAX-parser-like interface would have very little cost. But honestly, I don't see the point of it. This is a parser for Emacs only. It has a very specific purpose: to make JSON parsing fast in Emacs. It is a small module. Input is JSON, output is Lisp objects. Working with Lisp objects inside Emacs is a natural thing, and usually there is no need for intermediate representations. So if the only reason to have an Emacs-independent API is to make the parser fuzz-testable, then wouldn't it make more sense to make Emacs fuzz-testable in general? I find that approach more useful, because I think this parser is not the only sensible target for fuzz testing.
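
To make the SAX-like option discussed above concrete, here is a minimal sketch of what such an event interface could look like. None of these names come from the patch: struct json_events, parse_flat_array and the tally consumer are hypothetical, and the toy parser only handles a flat array of non-negative integers. The point is only the shape: the parser reports events through callbacks, so a test or fuzz build can plug in a trivial consumer, while an Emacs build would plug in callbacks that create Lisp objects.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical event interface.  */
struct json_events
{
  void (*begin_array) (void *ctx);
  void (*end_array) (void *ctx);
  void (*number) (void *ctx, long value);
};

/* Toy parser for input such as "[1,22,3]".  Returns 1 on success and
   0 on malformed input or on a number that does not fit in a long.  */
int
parse_flat_array (const char *s, size_t len,
		  const struct json_events *ev, void *ctx)
{
  size_t i = 0;
  assert (ev != NULL);
  if (i == len || s[i] != '[')
    return 0;
  i++;
  ev->begin_array (ctx);
  if (i < len && s[i] == ']')
    {
      ev->end_array (ctx);
      return i + 1 == len;
    }
  for (;;)
    {
      if (i == len || s[i] < '0' || s[i] > '9')
	return 0;
      long value = 0;
      while (i < len && s[i] >= '0' && s[i] <= '9')
	{
	  int d = s[i] - '0';
	  if (value > (LONG_MAX - d) / 10)
	    return 0;		/* would overflow long */
	  value = value * 10 + d;
	  i++;
	}
      ev->number (ctx, value);
      if (i < len && s[i] == ',')
	i++;
      else if (i < len && s[i] == ']')
	{
	  ev->end_array (ctx);
	  return i + 1 == len;
	}
      else
	return 0;
    }
}

/* Trivial consumer for tests: counts events, creates no objects.  */
struct tally { long arrays; long count; long last; };

static void tally_begin (void *ctx) { ((struct tally *) ctx)->arrays++; }
static void tally_end (void *ctx) { (void) ctx; }
static void
tally_number (void *ctx, long value)
{
  struct tally *t = ctx;
  t->count++;
  t->last = value;
}

const struct json_events tally_events
  = { tally_begin, tally_end, tally_number };
```

The trade-off raised in the message still applies: the indirection costs little, but it adds an API that exists mainly for testing.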


