
From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH 4/4] json-streamer: Limit number of tokens in addition to total size
Date: Fri, 30 Oct 2015 08:52:31 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

Eric Blake <address@hidden> writes:

> On 10/29/2015 12:27 PM, Markus Armbruster wrote:
>>> Sounds like we have some quadratic (or worse) scaling in the parser.
>>> Worth fixing some day, but I also agree that we don't have to tackle it
>>> in this series.
>> 
>> I believe it's linear with a criminally negligent constant (several KiB
>> per token).  The first hog is actually fairly obvious: we use one QDict
>> per token.
>
> Wow.  I just read through the code, and you're right.  We are passing
> around QDicts right and left (one per token, with 4 keys, which is
> several mallocs per token), and then creating a QList of those tokens.

Disgusting, isn't it?

> Prior to commit 65c0f1e9, when we were really incrementing the refcount
> of each token on each level of nesting (as part of copying context, for
> definite quadratic behavior), the refcounts let us ensure tokens would
> be cleaned up at the end.  But I'm hard pressed to see the refcount of
> tokens going above one in the current code, which means we aren't
> gaining anything by using QDict for reference counting.  For that
> matter, JSON is quite linear - the code talks about needing to backtrack
> in different contexts, but JSON doesn't have ambiguities that need
> backtracking to try multiple different rules.  It seems like the code is
> overengineered because it is copied from another language where
> backtracking to try several alternate parses actually makes sense.

Yes, starting with a copy of an inappropriate solution is a possible
explanation for this mess.

> I suspect that using a C struct per token, and a C array of those
> structs, would go a long way towards alleviating memory abuse per token.

Yes.
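Eric's suggestion above — one plain C struct per token, stored in a growable C array instead of a QList of QDicts — could be sketched roughly as follows. This is a hypothetical illustration, not the patch that was actually written for QEMU: the names (`JSONToken`, `TokenBuf`, `token_buf_append`) and the token-type enum are assumptions made for the example, though the four fields mirror the four keys Eric mentions the QDict carrying per token.

```c
#include <assert.h>
#include <stdlib.h>

/* Sketch only: one small struct per token instead of one QDict
 * (several mallocs) per token.  All names here are hypothetical. */
typedef enum {
    JSON_LCURLY, JSON_RCURLY, JSON_LSQUARE, JSON_RSQUARE,
    JSON_COLON, JSON_COMMA, JSON_STRING, JSON_INTEGER,
    JSON_FLOAT, JSON_KEYWORD,
} JSONTokenType;

typedef struct {
    JSONTokenType type;
    int x, y;           /* column and line, as in the QDict version */
    const char *str;    /* lexeme; points into the input buffer      */
    size_t len;
} JSONToken;

/* A growable array of tokens: one realloc per doubling, instead of
 * a QList node plus a QDict plus its entries for every token. */
typedef struct {
    JSONToken *tokens;
    size_t count, capacity;
} TokenBuf;

static void token_buf_append(TokenBuf *buf, JSONToken tok)
{
    if (buf->count == buf->capacity) {
        buf->capacity = buf->capacity ? buf->capacity * 2 : 16;
        buf->tokens = realloc(buf->tokens,
                              buf->capacity * sizeof(JSONToken));
        assert(buf->tokens);  /* error handling elided in this sketch */
    }
    buf->tokens[buf->count++] = tok;
}
```

Since the tokens are owned by exactly one buffer (the refcount never exceeds one in the current code, as Eric observes), freeing is a single `free(buf->tokens)` rather than per-token refcount drops.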

> Are you tackling that, or would you like me to take a stab at it while
> you work on flushing my pending qapi patches?

I think I could use a bit of hacking to clear my mind between bouts of
patch review.  This job should be about the right size.  Plus, I'd better
become more familiar with qobject/json-* --- Luiz is dying to get rid of
it ;)

>>> I'm assuming you temporarily patched check-qjson to use larger constants
>>> when you hit your ~100K token testing?  Because I am definitely seeing a
>>> lot of execution time spent on large_dict when running tests/check-qjson
>>> by hand, in relation to all the other tests of that file, but not
>>> minutes worth.  Care to post the diff you played with?
>> 
>> I tested on a slow machine.
>
> I guess it was all the malloc pressure on a low-memory system that would
> make it so much slower than what I'm seeing, if you stuck with the
> default gen_test_json(gstr, 10, 100).

Something must have been wrong with this machine yesterday, because
today it runs the exact same test much, much faster.  Computers...

I'll update the commit message.  Thanks for your sanity check!


