Re: [Qemu-devel] Re: [PATCH 09/11] json-lexer: limit the maximum size of a given token

From: Luiz Capitulino
Subject: Re: [Qemu-devel] Re: [PATCH 09/11] json-lexer: limit the maximum size of a given token
Date: Mon, 14 Mar 2011 17:33:11 -0300
On Mon, 14 Mar 2011 15:18:57 -0500
Anthony Liguori <address@hidden> wrote:
> On 03/14/2011 02:25 PM, Luiz Capitulino wrote:
> > On Fri, 11 Mar 2011 15:00:47 -0600
> > Anthony Liguori <address@hidden> wrote:
> >
> >> This is a security consideration. We don't want a client to cause an
> >> arbitrary
> >> amount of memory to be allocated in QEMU. For now, we use a limit of 64MB
> >> which should be large enough for any reasonably sized token.
> >>
> >> This is important for parsing JSON from untrusted sources.
> >>
> >> Signed-off-by: Anthony Liguori <address@hidden>
> >>
> >> diff --git a/json-lexer.c b/json-lexer.c
> >> index 834d7af..3462c89 100644
> >> --- a/json-lexer.c
> >> +++ b/json-lexer.c
> >> @@ -18,6 +18,8 @@
> >> #include "qemu-common.h"
> >> #include "json-lexer.h"
> >>
> >> +#define MAX_TOKEN_SIZE (64ULL << 20)
> >> +
> >> /*
> >> *
> >> \"([^\\\"]|(\\\"\\'\\\\\\/\\b\\f\\n\\r\\t\\u[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]))*\"
> >> *
> >> '([^\\']|(\\\"\\'\\\\\\/\\b\\f\\n\\r\\t\\u[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]))*'
> >> @@ -312,6 +314,17 @@ static int json_lexer_feed_char(JSONLexer *lexer,
> >> char ch)
> >> }
> >> lexer->state = new_state;
> >> } while (!char_consumed);
> >> +
> >> + /* Do not let a single token grow to an arbitrarily large size,
> >> + * this is a security consideration.
> >> + */
> >> + if (lexer->token->length > MAX_TOKEN_SIZE) {
> >> + lexer->emit(lexer, lexer->token, lexer->state, lexer->x,
> >> lexer->y);
> >> + QDECREF(lexer->token);
> >> + lexer->token = qstring_new();
> >> + lexer->state = IN_START;
> >> + }
> > Entering an invalid token is an error, we should fail here.
>
> It's not so clear to me.
>
> I think of it like GCC. GCC doesn't bail out on the first invalid
> character the lexer encounters. Instead, it records that an error has
> occurred and tries its best to recover.
>
> The result is that instead of getting an error message about the first
> error in your code, you'll get a long listing of all the mistakes you
> made (usually).
I'm not a compiler expert, but I think compilers do this because reporting
one error at a time would be very problematic from a usability perspective:
by reporting most errors per run, the compiler lets the user fix as many
programming mistakes as possible without having to re-run it.
Our target audience is not humans, though.
> One thing that makes this more difficult in our case is that when you're
> testing, we don't have a clear EOI to flush things out. So a bad
> sequence of inputs might make the message parser wait for a character
> that you're not necessarily inputting which makes the session appear
> hung. Usually, if you throw a couple extra brackets around, you'll get
> back to valid input.
Not sure if this is easy to do, but shouldn't we fail if we don't get
the expected character?
> > Which brings
> > two features:
> >
> > 1. A test code could trigger this condition and check for the specific
> > error code
> >
> > 2. Developers will know when they hit the limit. Although I don't expect
> > this to happen, there was talk about adding base64 support
> > to transfer something (I can't remember what, but we never know how the
> > protocol will evolve).
> >
> > Also, by testing this I found that the parser seems to get confused when
> > the limit is reached: it stops responding.
>
> Actually, it does respond. The lexer just takes an incredibly long time
> to process a large token because qstring_append_ch is incredibly slow
> :-) If you drop the token size down to 64k instead of 64mb, it'll seem
> a lot more reasonable.
Actually, I dropped it down to 20 :) This is supposed to work like your 64k
suggestion, right?
>
> Regards,
>
> Anthony Liguori
>
> >> +
> >> return 0;
> >> }
> >>
> >
>