bug#70007: [PATCH] native JSON encoder
From: Eli Zaretskii
Subject: bug#70007: [PATCH] native JSON encoder
Date: Wed, 27 Mar 2024 19:40:59 +0200
> From: Mattias Engdegård <mattias.engdegard@gmail.com>
> Date: Wed, 27 Mar 2024 16:49:53 +0100
> Cc: 70007@debbugs.gnu.org
>
> Here is an updated patch. It now ignores duplicated keys in objects
> represented by alists and plists, just like the old encoder. (I didn't
> include this in the first draft out of fear it would be slow and complicated,
> but it turned out just to be complicated.)
>
> The performance is still acceptable, which means at least 2x the speed of the
> Jansson-based encoder.
Thanks. A few initial comments and questions, based on a very cursory
reading.
> +/* JSON encoding context */
This is not our comment style.
> +typedef struct {
> + char *buf;
> + ptrdiff_t size; /* number of bytes in buf */
> + ptrdiff_t capacity; /* allocated size of buf */
> + ptrdiff_t chars_delta; /* size - {number of Unicode chars in buf} */
When you say "Unicode chars", what do you mean? characters or bytes?
If characters, then why do you need to qualify them with "Unicode"?
> +struct symset_tbl
> +{
> +  /* Table used by the containing object if any, so that we can easily
> +     free all tables if an error occurs. */
> + struct symset_tbl *up;
> + /* Table of symbols (2**bits entries), Qunbound where unused. */
> + Lisp_Object entries[];
^^
Is this portable enough?
> +static struct symset_tbl *
> +alloc_symset_table (int bits)
> +{
> +  struct symset_tbl *st
> +    = xmalloc (sizeof *st + (sizeof *st->entries << bits));
> + int size = 1 << bits;
I'd add an assertion here that BITS is not large enough to produce zero.
> +/* Enlarge the table used by a symset. */
^^
Two spaces there, please.
> +static NO_INLINE void
> +symset_expand (symset_t *ss)
> +{
> + struct symset_tbl *old_table = ss->table;
> + int oldbits = ss->bits;
> + int oldsize = 1 << oldbits;
I'd add an assertion here about the magnitude of BITS.
> + while (p < end)
> + {
> + unsigned char c = *p;
> + if (json_plain_char[c])
> + {
> + json_out_byte (jo, c);
> + p++;
> + }
> + else if (c > 0x7f)
> + {
> + if (STRING_MULTIBYTE (str))
> + {
> + int n;
> + if (c <= 0xc1)
> + string_not_unicode (str);
> + if (c <= 0xdf)
> + n = 2;
> + else if (c <= 0xef)
> + {
> + int v = (((c & 0x0f) << 12)
> + + ((p[1] & 0x3f) << 6) + (p[2] & 0x3f));
> + if (char_surrogate_p (v))
> + string_not_unicode (str);
> + n = 3;
> + }
> + else if (c <= 0xf7)
> + {
> + int v = (((c & 0x07) << 18)
> + + ((p[1] & 0x3f) << 12)
> + + ((p[2] & 0x3f) << 6)
> + + (p[3] & 0x3f));
> + if (v > MAX_UNICODE_CHAR)
> + string_not_unicode (str);
> + n = 4;
> + }
> + else
> + string_not_unicode (str);
> + json_out_str (jo, (const char *)p, n);
> + jo->chars_delta += n - 1;
> + p += n;
> + }
> + else
> + string_not_unicode (str);
This rejects unibyte non-ASCII strings, AFAIU, in which case I suggest
thinking about whether we really want that. E.g., why is it wrong to
encode a string to UTF-8, and then send it to JSON?
> +static void
> +json_out_float (json_out_t *jo, Lisp_Object f)
> +{
> + double x = XFLOAT_DATA (f);
> + if (isinf (x) || isnan (x))
> + signal_error ("not a finite number", f);
Is JSON unable to handle Inf and NaN?
> +static Lisp_Object
> +json_out_string_result (json_out_t *jo)
> +{
> + /* FIXME: should this be a unibyte or multibyte string?
> + Right now we make a multibyte string for test compatibility,
> + but we are really encoding so unibyte would make more sense. */
I indeed think this should be a unibyte string, because otherwise
writing it to a file or a process will/might encode it, which would be
wrong.
> + json_out_t jo = {
> + .maxdepth = 25,
Is this arbitrary, or is it what JSON expects? If arbitrary, should
it be customizable? Should it be documented?
> + /* FIXME: Do we really need to do all this work below to insert a string?
> + Is there no function already written? I must be missing something. */
There is no function. All the insert_from_* functions in insdel.c do
something similar.
Btw, shouldn't json-insert call treesit_record_change? Yuan?