Re: [Qemu-devel] [PATCH 1/6] json: Fix lexer for lookahead character beyond '\x7F'
Tue, 28 Aug 2018 06:28:30 +0200
Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)
Eric Blake <address@hidden> writes:
> On 08/27/2018 02:00 AM, Markus Armbruster wrote:
>> The lexer fails to end a valid token when the lookahead character is
>> beyond '\x7F'. For instance, input
>> true\xC2\xA2
>> produces the tokens
>> JSON_ERROR true\xC2
>> JSON_ERROR \xA2
>> The first token should be
>> JSON_KEYWORD true
> As long as we still get a JSON_ERROR in the end.
We do: one for \xC2, and one for \xA2. PATCH 4 will lose the second one.
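To make the before/after token boundaries concrete, here is a minimal sketch. It is not the QEMU lexer: the token kinds, first_token(), and the terminal_max parameter are all made up for illustration, and "keyword" is simplified to a run of lowercase letters. It only models the one point under discussion, whether a lookahead byte above 0x7F is allowed to end a token.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative token kinds; the names are made up for this sketch. */
enum tok_kind { TOK_ERROR, TOK_KEYWORD };

struct token {
    enum tok_kind kind;
    size_t len;             /* number of input bytes the token covers */
};

/* The input discussed in the thread: "true" followed by bytes C2 A2. */
const unsigned char demo_input[] = "true\xC2\xA2";
#define DEMO_LEN (sizeof(demo_input) - 1)   /* drop the trailing NUL */

/*
 * Return the first token of in[0..len-1].  terminal_max is the largest
 * lookahead byte that may end a keyword: 0x7F models the bug, 0xFF
 * models a table that covers every byte value.
 */
struct token first_token(const unsigned char *in, size_t len,
                         unsigned terminal_max)
{
    struct token t = { TOK_ERROR, 0 };
    size_t i = 0;

    while (i < len && in[i] >= 'a' && in[i] <= 'z') {
        i++;                    /* keyword characters are consumed */
    }
    if (i == 0) {
        t.len = len ? 1 : 0;    /* no keyword here: one-byte error token */
    } else if (i == len || in[i] <= terminal_max) {
        t.kind = TOK_KEYWORD;   /* lookahead ends the token, not consumed */
        t.len = i;
    } else {
        t.len = i + 1;          /* lookahead swallowed into an error token */
    }
    return t;
}

/* Usage: compare the buggy and the fixed behaviour on the same input. */
void check_first_token(void)
{
    struct token bug = first_token(demo_input, DEMO_LEN, 0x7F);
    struct token fixed = first_token(demo_input, DEMO_LEN, 0xFF);

    assert(bug.kind == TOK_ERROR && bug.len == 5);      /* "true\xC2" */
    assert(fixed.kind == TOK_KEYWORD && fixed.len == 4); /* "true" */
}
```

With terminal_max at 0x7F the first token is an error covering "true\xC2"; with 0xFF it is the keyword "true", and the error is reported for the bytes that follow.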
>> The culprit is
>> #define TERMINAL(state) [0 ... 0x7F] = (state)
>> It leaves [0x80..0xFF] zero, i.e. IN_ERROR. Has always been broken.
> I wonder if that was done because it was assuming that valid input is
> only ASCII, and that any byte larger than 0x7f is invalid except
> within the context of a string.
> But whatever the reason for the
> original bug, your fix makes sense.
>> Fix it to initialize the complete array.
> Worth testsuite coverage?
Since lookahead bytes > 0x7F are always a parse error, all the bug can
do is swallow a TERMINAL() token right before a parse error. The
TERMINAL() tokens are JSON_INTEGER, JSON_FLOAT, JSON_KEYWORD, JSON_SKIP,
JSON_INTERP. Fairly harmless. In particular, JSON objects get through
even when followed by a byte > 0x7F.
Of course, test coverage wouldn't hurt regardless.
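For anyone who wants to see the table effect in isolation, here is a small sketch of what the range designator does. The row arrays, DEMO_DONE, and the two macro names are hypothetical; TERMINAL_ALL is only my assumption of what "initialize the complete array" looks like, not a copy of the patch. The [first ... last] designator is a GNU C extension, so this needs GCC or Clang.

```c
#include <assert.h>

/* IN_ERROR is zero, as in the thread; DEMO_DONE is a made-up state. */
enum { IN_ERROR = 0, DEMO_DONE = 1 };

/* The macro as quoted above: only entries 0x00..0x7F are set. */
#define TERMINAL_ASCII(state) [0 ... 0x7F] = (state)
/* Assumed shape of a fix: set all 256 entries. */
#define TERMINAL_ALL(state)   [0 ... 0xFF] = (state)

/* Entries without an initializer are zero, which here means IN_ERROR. */
const unsigned char row_ascii[256] = { TERMINAL_ASCII(DEMO_DONE) };
const unsigned char row_all[256]   = { TERMINAL_ALL(DEMO_DONE) };

/* Usage: look up the next state for an ASCII and a non-ASCII lookahead. */
void check_rows(void)
{
    assert(row_ascii[' '] == DEMO_DONE);    /* ASCII lookahead is fine */
    assert(row_ascii[0xC2] == IN_ERROR);    /* the bug: left at zero */
    assert(row_all[0xC2] == DEMO_DONE);     /* full range covered */
}
```

A test along these lines would only pin down the table contents; covering the lexer itself would mean feeding it input such as the example from the commit message.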
>> Signed-off-by: Markus Armbruster <address@hidden>
>> qobject/json-lexer.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
> Reviewed-by: Eric Blake <address@hidden>
- [Qemu-devel] [PATCH 0/6] json: More fixes, error reporting improvements, cleanups, Markus Armbruster, 2018/08/27
- [Qemu-devel] [PATCH 1/6] json: Fix lexer for lookahead character beyond '\x7F', Markus Armbruster, 2018/08/27
- [Qemu-devel] [PATCH 2/6] json: Clean up how lexer consumes "end of input", Markus Armbruster, 2018/08/27
- [Qemu-devel] [PATCH 3/6] json: Make lexer's "character consumed" logic less confusing, Markus Armbruster, 2018/08/27
- [Qemu-devel] [PATCH 4/6] json: Nicer recovery from lexical errors, Markus Armbruster, 2018/08/27
- [Qemu-devel] [PATCH 5/6] json: Eliminate lexer state IN_ERROR, Markus Armbruster, 2018/08/27