From: Eric Blake
Subject: Re: [Qemu-devel] [PATCH v2 41/60] json: Nicer recovery from invalid leading zero
Date: Mon, 20 Aug 2018 13:36:35 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

On 08/20/2018 06:39 AM, Markus Armbruster wrote:

In review of v1, we discussed whether to try matching non-integer
numbers with redundant leading zero.  Doing that tightly in the lexer
requires duplicating six states.  A simpler alternative is to have the
lexer eat "digit salad" after redundant leading zero: 0[0-9.eE+-]+.
Your suggestion for hexadecimal numbers is digit salad with different
digits: [0-9a-fA-FxX].  Another option is their union: [0-9a-fA-FxX+-].
Even more radical would be eating anything but whitespace and structural
characters: [^][}{:, \t\n\r].  That idea pushed to the limit results in
a two-stage lexer: first stage finds token strings, where a token string
is a structural character or a sequence of non-structural,
non-whitespace characters, second stage rejects invalid token strings.
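
To make the two-stage idea concrete, here is a rough Python sketch (not the QEMU C lexer, and not code from this series). The names TOKEN_STRING, NUMBER, classify and lex are made up for illustration, and JSON strings are left out on purpose: they may contain whitespace and structural characters, so a real first stage would have to treat them specially.

```python
import re

# Stage 1: a token string is one structural character, or a run of
# characters that are neither structural nor whitespace.
STRUCTURAL = "[]{}:,"
TOKEN_STRING = re.compile(r"[\[\]{}:,]|[^\[\]{}:, \t\n\r]+")

# Stage 2: accept only token strings that are valid JSON tokens.
# Strings are deliberately not handled in this sketch.
NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")
KEYWORDS = {"true", "false", "null"}


def find_token_strings(text):
    """Stage 1: split the input into token strings, dropping whitespace."""
    return TOKEN_STRING.findall(text)


def classify(token):
    """Stage 2: name the kind of a token string, or 'error' if invalid."""
    if len(token) == 1 and token in STRUCTURAL:
        return "structural"
    if token in KEYWORDS:
        return "keyword"
    if NUMBER.fullmatch(token):
        return "number"
    return "error"


def lex(text):
    return [(tok, classify(tok)) for tok in find_token_strings(text)]


for tok, kind in lex("[012.5e+3, 0x1F, 7]"):
    print(f"{tok!r}: {kind}")
```

With this split, both 012.5e+3 and 0x1F come out as a single error token each, instead of an error followed by leftover fragments.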

Hmm, we could try to recover from lexical errors more smartly in
general: instead of ending the JSON error token after the first
offending character, end it before the first whitespace or structural
character following the offending character.
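
As a rough illustration of the difference between the two recovery policies, here is a small Python sketch. It is an assumption-laden model, not the actual lexer: the helper names and the idea of passing the index of the offending character are hypothetical.

```python
# Characters that end an error token under the proposed policy:
# JSON structural characters plus whitespace.
DELIMITERS = set("[]{}:, \t\n\r")


def error_token_end_first_char(text, offending):
    """Current behaviour as described: end right after the offending character."""
    return offending + 1


def error_token_end_delimiter(text, offending):
    """Proposed behaviour: end before the next whitespace or structural character."""
    end = offending + 1
    while end < len(text) and text[end] not in DELIMITERS:
        end += 1
    return end


text = "[0123, 4]"
start = 1      # the token begins at the redundant leading '0'
offending = 2  # the '1' after it is the first offending character

print(text[start:error_token_end_first_char(text, offending)])  # 01
print(text[start:error_token_end_delimiter(text, offending)])   # 0123
```

Under the first policy the remaining "23" is left in the input to be lexed on its own; under the second the whole "0123" is reported as one error token and lexing resumes at the comma.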

I can try that, but I'd prefer to try it in a follow-up patch.

Indeed, that sounds like a valid approach. So, for this patch, I'm fine with just accepting ['0' ... '9'], then seeing if the later smarter-lexing change makes back-to-back non-structural tokens give saner error messages in general.

Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
