Re: JSON/YAML/TOML/etc. parsing performance

From: Philipp Stephani
Subject: Re: JSON/YAML/TOML/etc. parsing performance
Date: Tue, 19 Sep 2017 08:18:14 +0000

Philipp Stephani <address@hidden> wrote on Mon, 18 Sep 2017 at 15:26:
Philipp Stephani <address@hidden> wrote on Sun, 17 Sep 2017 at 20:46:
Ted Zlatanov <address@hidden> wrote on Sat, 16 Sep 2017 at 17:55:
I wanted to ask if there's any chance of improving the parsing
performance of JSON, YAML, TOML, and similar data formats. It's pretty
poor today.

That could be done in the core with C code, improved Lisp code,
integration with an external library, or a mix of those.

I don't know much about the others, but given the importance of JSON as a data exchange and serialization format, I think it's worthwhile to invest some time here. I've implemented a wrapper around the json-c library (license: Expat/X11/MIT), resulting in significant speedups using the test data from https://github.com/miloyip/nativejson-benchmark: a factor of 3.9 to 6.4 for parsing, and a factor of 27 to 67 for serializing. If people agree that this is useful, I can send a patch.

I've discovered that the interface and documentation of Jansson are much better than those of json-c, so I switched to Jansson. I've attached a patch.

Here's a newer version of the patch. The only significant difference is that the Lisp values for JSON null and false are now :null and :false, respectively. Using the dedicated symbol :null for JSON null avoids the mental overhead of the triple meaning of nil (null, false, empty list), and is more future-proof, should we ever want to support lists.
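
To illustrate the ambiguity, here is a small Python model. It is only a sketch and not code from the patch: the names Sym, to_lisp_naive and to_lisp_keywords are hypothetical, and the "naive" mapping is an assumed alternative in which null, false and the empty array would all be represented by nil.

```python
import json
from enum import Enum


class Sym(Enum):
    """Stand-ins for Lisp symbols (hypothetical model, not Emacs code)."""
    NIL = "nil"
    NULL = ":null"
    FALSE = ":false"


def to_lisp_naive(value):
    """Assumed alternative: null, false and [] all collapse to nil."""
    if value is None or value is False or value == []:
        return Sym.NIL
    if isinstance(value, list):
        return [to_lisp_naive(v) for v in value]
    if isinstance(value, dict):
        return {k: to_lisp_naive(v) for k, v in value.items()}
    return value


def to_lisp_keywords(value):
    """Mapping with dedicated symbols: null -> :null, false -> :false."""
    if value is None:
        return Sym.NULL
    if value is False:
        return Sym.FALSE
    if isinstance(value, list):
        return [to_lisp_keywords(v) for v in value]
    if isinstance(value, dict):
        return {k: to_lisp_keywords(v) for k, v in value.items()}
    return value


doc = json.loads('{"a": null, "b": false, "c": []}')
naive = to_lisp_naive(doc)
keyed = to_lisp_keywords(doc)
print(naive)
print(keyed)
```

In the naive model the three members become indistinguishable, so the original document cannot be recovered when serializing again. With the dedicated symbols each member keeps a distinct value.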

