From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH 2/6] json: Clean up how lexer consumes "end of input"
Date: Tue, 28 Aug 2018 06:28:59 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)

Eric Blake <address@hidden> writes:
> On 08/27/2018 02:00 AM, Markus Armbruster wrote:
>> When the lexer isn't in its start state at the end of input, it's
>> working on a token. To flush it out, it needs to transit to its start
>> state on "end of input" lookahead.
>>
>> There are two ways to the start state, depending on the current state:
>>
>> * If the lexer is in a TERMINAL(JSON_FOO) state, it can emit a
>> JSON_FOO token.
>>
>> * Else, it can go to IN_ERROR state, and emit a JSON_ERROR token.
>>
>> There are complications, however:
>>
>> * The transition to IN_ERROR state consumes the input character and
>> adds it to the JSON_ERROR token. The latter is inappropriate for
>> the "end of input" character, so we suppress that. See also recent
>> commit "json: Fix lexer to include the bad character in JSON_ERROR
>> token".
>
> Now commit a2ec6be7

I'll update the commit message.

>>
>> * The transition to a TERMINAL(JSON_FOO) state doesn't consume the
>> input character. In that case, the lexer normally loops until it is
>> consumed. We have to suppress that for the "end of input" input
>> character. If we didn't, the lexer would consume it by entering
>> IN_ERROR state, emitting a bogus JSON_ERROR token. We fixed that in
>> commit bd3924a33a6.
>>
>> However, simply breaking the loop this way assumes that the lexer
>> needs exactly one state transition to reach its start state. That
>> assumption is correct now, but it's unclean, and I'll soon break it.
>> Clean up: instead of breaking the loop after one iteration, break it
>> once the lexer has reached its start state.
>>
>> Signed-off-by: Markus Armbruster <address@hidden>
>> ---
>> qobject/json-lexer.c | 17 +++++++++--------
>> 1 file changed, 9 insertions(+), 8 deletions(-)
>>
>> diff --git a/qobject/json-lexer.c b/qobject/json-lexer.c
>> index 4867839f66..ec3aec726f 100644
>> --- a/qobject/json-lexer.c
>> +++ b/qobject/json-lexer.c
>> @@ -261,7 +261,8 @@ void json_lexer_init(JSONLexer *lexer, bool enable_interpolation)
>> static void json_lexer_feed_char(JSONLexer *lexer, char ch, bool flush)
>> {
>> - int char_consumed, new_state;
>> + int new_state;
>> + bool char_consumed = false;
>
> Yay for the switch to bool.
>
> Reviewed-by: Eric Blake <address@hidden>

Thanks!
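
For anyone reading along without the full diff, the loop condition the
commit message describes can be sketched roughly as below. This is a toy
lexer, not the code in qobject/json-lexer.c: every toy_* name, the three
token kinds, and the log buffer are assumptions made up for illustration.
It shows the two points from the commit message: on flush the loop runs
until the lexer is back in its start state (rather than exactly once),
and the "end of input" character is never added to a token.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical states; the last three are terminal and reset to TOY_START. */
enum toy_state {
    TOY_START, TOY_IN_DIGITS, TOY_IN_STRING,
    TOY_DONE_NUMBER,   /* reached on lookahead, which is NOT consumed */
    TOY_DONE_STRING,   /* reached on the closing quote, which IS consumed */
    TOY_ERROR
};

struct toy_lexer {
    enum toy_state state;
    char token[32];
    size_t len;
    char log[128];     /* emitted tokens, e.g. "NUM(12) ERR(x) " */
};

static enum toy_state toy_next_state(enum toy_state s, char ch)
{
    switch (s) {
    case TOY_START:
        if (isdigit((unsigned char)ch)) return TOY_IN_DIGITS;
        if (ch == '"') return TOY_IN_STRING;
        if (ch == ' ' || ch == '\n') return TOY_START;
        return TOY_ERROR;
    case TOY_IN_DIGITS:
        return isdigit((unsigned char)ch) ? TOY_IN_DIGITS : TOY_DONE_NUMBER;
    case TOY_IN_STRING:
        if (ch == '"') return TOY_DONE_STRING;
        return ch == '\0' ? TOY_ERROR : TOY_IN_STRING;
    default:
        return TOY_ERROR;
    }
}

/* Append "KIND(token) " to the log and reset the token buffer. */
static void toy_emit(struct toy_lexer *lx, const char *kind)
{
    size_t used = strlen(lx->log);

    lx->token[lx->len] = '\0';
    snprintf(lx->log + used, sizeof(lx->log) - used, "%s(%s) ", kind, lx->token);
    lx->len = 0;
}

static void toy_feed_char(struct toy_lexer *lx, char ch, bool flush)
{
    bool char_consumed = false;

    /* On flush: loop until back in the start state, however many
     * transitions that takes.  Otherwise: loop until ch is consumed. */
    while (flush ? lx->state != TOY_START : !char_consumed) {
        enum toy_state next;

        assert(lx->state <= TOY_IN_STRING);   /* never parked in a terminal */
        next = toy_next_state(lx->state, ch);
        char_consumed = (next != TOY_DONE_NUMBER);
        /* Never add the "end of input" character to a token. */
        if (char_consumed && !flush && lx->len + 1 < sizeof(lx->token)) {
            lx->token[lx->len++] = ch;
        }
        switch (next) {
        case TOY_DONE_NUMBER: toy_emit(lx, "NUM"); next = TOY_START; break;
        case TOY_DONE_STRING: toy_emit(lx, "STR"); next = TOY_START; break;
        case TOY_ERROR:       toy_emit(lx, "ERR"); next = TOY_START; break;
        case TOY_START:       lx->len = 0; break;   /* skipped whitespace */
        default: break;
        }
        lx->state = next;
    }
}

/* Lex a whole string, then flush with an "end of input" character. */
static const char *toy_lex(struct toy_lexer *lx, const char *input)
{
    memset(lx, 0, sizeof(*lx));
    for (; *input; input++) {
        toy_feed_char(lx, *input, false);
    }
    toy_feed_char(lx, '\0', true);
    return lx->log;
}
```

In this toy, flushing while inside a number emits the pending number,
and flushing inside an unterminated string emits an error token that
holds only the characters seen so far. Each toy state happens to need
just one transition to get back to the start state, but the loop
condition no longer depends on that.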