Re: [Qemu-devel] [PATCH v3 4/9] monitor: no need to save need_resume

From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH v3 4/9] monitor: no need to save need_resume
Date: Wed, 29 Aug 2018 10:55:51 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)

Peter Xu <address@hidden> writes:

> On Tue, Aug 28, 2018 at 05:46:29PM +0200, Markus Armbruster wrote:
>> Peter Xu <address@hidden> writes:
>> > On Sat, Aug 25, 2018 at 03:57:19PM +0200, Marc-André Lureau wrote:
>> >> There is no need for per-command need_resume granularity, it should
>> >> resume after running an non-oob command on oob-disabled monitor.
>> >> 
>> >> Signed-off-by: Marc-André Lureau <address@hidden>
>> >> Reviewed-by: Markus Armbruster <address@hidden>
>> >
>> > Note that this series/patch still conflict with the "enable
>> > out-of-band by default" series.
>> >
>> >   [PATCH v6 00/13] monitor: enable OOB by default
>> Yes.
>> > I'm not against this patch to be merged since it has its r-b, but I
>> > feel like we'd better judge on whether we still like the response
>> > queue first, in case one day we'll need to add these things back.
>> Let's not worry about things that may or may not happen at some
>> indeterminate time in the future.
> It might not be that "far future"...  Please see below.
>> However:
>> > When there could be functional changes around the code path I would
>> > think we'd better keep the cleanup patches postponed a bit until those
>> > functional changes are settled.  For now the functional part is decide
>> > how to fix up the rest of out-of-band issues (my proposal is in the
>> > series above which should solve everything that is related to
>> > out-of-band to be fixed; if there is more, I'll continue to work on
>> > it), whether we should enable it by default for 3.1 (my answer
>> > is... yes...), and what to do with it.
>> I agree the important job is to finish OOB.
>> Sometimes, it's better to clean up first.  Sometimes, it's not.
>> Right now, the response queue is a useless complication, and
>> Marc-André's PATCH 3+4 get rid of it.  Lovely.  I understand this
>> conflicts with your OOB work.  The question is whether your work
>> fundamentally needs the response queue or not.
> Just to clarify a bit... I prefer to keep the response queue not
> because it's conflicting my existing work but because I think we might
> get use of it even in the near future.  I stated it here on the
> possibility that we might use the response queue to solve the
> unlimited monitor out_buf issue here:
> https://patchwork.kernel.org/patch/10511471/#22110771
> Quotes:
>         ...
>         Yeah actually this reminded me about the fact that we are
>         still using unlimited buffer size for the out_buf.  IMHO we
>         should make it a limited size after 3.0, then AFAICT if
>         without current qmp response queue we'll need to introduce
>         some similar thing to cache responses then when the out_buf is
>         full.
>         IMHO the response queue looks more like the correct place that
>         we should do the flow control, and the out_buf could be
>         majorly used to do the JSON->string convertion (so basically
>         IMHO the out_buf limitation should be the size of the maximum
>         JSON object that we'll have).
>         ...
> Let's imagine what we need if we want to limit the out_buf: (1) To
> limit the out_buf, we need somewhere to cache the responses that we
> want to put into out_buf but we can't when the out_buf is full -
> that's mostly the response queue now.  (2) Since we will need to queue
> these items onto out_buf when out_buf is not full, we'll possibly need
> something like a bottom half to kickoff when out_buf is able to handle
> more data - that's mostly the bottom half of the response queue.
> AFAIU the rest to do is very possible only that we set a limit to the
> out_buf but only if response queue is there...

Limiting outbuf only to have the same data pile up earlier in the
pipeline doesn't seem helpful.  We need to throttle production of
output.  A simple way to do that is throttling input.  See below.

> I didn't really work on the out_buf since I didn't want to further
> expand the out-of-band work (since it's already going far away before
> it settles down first...), and after all the out_buf issue is nothing
> related to the out-of-band work and it's there for a long time.
> However that's the major reason that I might prefer to keep the queue
> now.
> [1]

Let's review what we have.

QMP flow control is about limiting the amount of data QEMU buffers on
behalf of a QMP client.

This is about robustness, not security.  There are countless ways a QMP
client can make QEMU use more memory.  We want to cope with accidents,
not stop attacks.

A common and simple way to do flow control is to throttle receiving.
Unless the transport buffers way too much (which would be insane), the
peer will soon notice, and can adapt.

QMP input flows from the character device to commands.  It is converted
from text to QObject along the way.  Output flows from commands to the
character device.  It is converted from QObject to text along the way.

When the client sends commands faster than QEMU can execute them, flow
control should kick in to stop adding more input to the pileup.  When
QEMU produces output faster than the client can receive it, flow control
should kick in to stop adding more output to the pileup.  We can do that
indirectly, by stopping input.

Input buffering:

* The chardev buffers 4KiB (I think).  Small enough to be ignored.

* The JSON parser buffers one partial JSON value as a token list, up to
  a limit in the order of 64MiB.  Weasel words "in the order", because
  it measures memory consumption only indirectly.  The limit is
  ridiculously generous for a control plane purpose like QMP.

* When the partial JSON value becomes complete, the JSON parser converts
  it to a QObject, then frees the token list.

* Without OOB, the QMP core buffers one complete QObject (the command)
  until we're done with the command.  The JSON parser's buffer should be
  empty then, because the QMP core suspends reading from the char device
  while it deals with a command.

* With OOB, the QMP core buffers up to 8 in-band commands and one
  out-of-band command.

  When the in-band command buffer is full, we currently drop further
  in-band commands, but that's a bad idea, and we're going to suspend
  reading from the char device instead.  Once we do, the JSON parser's
  buffer should be empty when the in-band command buffer is full.  The
  remainder of my analysis assumes we suspend rather than drop.
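
To illustrate "suspend rather than drop", here is a toy model in Python.
It is not QEMU code: the class and method names are made up, and only
the limit of 8 comes from the text above.  The reader stops delivering
commands when the queue is full, and the dispatcher resumes it once
there is room again.

```python
from collections import deque


class InBandQueue:
    """Toy model of a bounded in-band command queue (hypothetical, not QEMU code)."""

    def __init__(self, limit=8):
        self.limit = limit              # 8 is the in-band limit mentioned above
        self.commands = deque()
        self.reading_suspended = False

    def enqueue(self, cmd):
        """Called by the (hypothetical) reader for each complete command."""
        if self.reading_suspended:
            raise RuntimeError("reader must not deliver input while suspended")
        self.commands.append(cmd)
        if len(self.commands) >= self.limit:
            # Queue is full: suspend reading instead of dropping commands.
            self.reading_suspended = True

    def dispatch_one(self):
        """Called by the (hypothetical) dispatcher; resumes reading when there is room."""
        cmd = self.commands.popleft()
        if len(self.commands) < self.limit:
            self.reading_suspended = False
        return cmd
```

Nothing is ever dropped in this model.  The client simply finds that
QEMU stops reading, and can adapt.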

Output buffering:

* Traditionally, the QMP core buffers one QObject while it converts it
  to text.  It buffers an unlimited amount of text in mon->outbuf.

* Adding a response queue doesn't by itself change how much data is
  buffered, only when it's converted from QObject to text.  If the
  bottom half doing the conversion runs quickly, nothing changes.  If it
  gets delayed for some reason, QObjects can pile up in the response
  queue before they get converted and moved to mon->outbuf.

Summary of flow control:

* We limit input buffers and stop reading input when the limit is
  exceeded.  This stops input piling up.

* We do not limit output at all.

Ideally, we'd keep track of combined input and output buffer size, and
throttle input to keep it under control.  But separate limits for
individual buffers could be good enough, and might be simpler.
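
A sketch of the two policies, to make the difference concrete.  The
function names and the numbers in the usage note are made up for
illustration, and how to measure the buffer sizes is left open.

```python
def should_suspend_combined(input_bytes, output_bytes, total_limit):
    """Ideal policy: one budget shared by input and output buffers."""
    return input_bytes + output_bytes >= total_limit


def should_suspend_separate(input_bytes, output_bytes, input_limit, output_limit):
    """Simpler policy: each buffer is checked against its own limit."""
    return input_bytes >= input_limit or output_bytes >= output_limit
```

With a combined budget of 64 split into separate limits of 32 each, the
separate check trips whenever the combined one does, and sometimes
earlier: 40 units of input and no output suspends reading under the
separate limits, but not under the combined budget.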

>> If your OOB work happens to be coded for the response queue, but the
>> problem could also be solved without the response queue, then the OOB
>> job doesn't fundamentally need the response queue.
> Yes, I think the OOB work itself does not need the response queue now.


>> Unless that's the case, getting rid of the response queue is unnecessary
>> churn.
>> If it is the case, we still need to consider effort.  Which order is
>> less total work?  Which order gets us to the goal faster?
>> Can you guys agree on answers to these questions, or do you need me to
>> answer them?
>> Restating the questions:
>> 1. Can you think of a way to do what Peter's OOB series does, but
>> without the response queue?
>> 2. If you can, what's easier / cheaper / faster:
>>    a. Merge Marc-André's patches to get rid of response queue, rewrite
>>       OOB series without response queue on top.
>>    b. Merge Peter's OOB series with response queue, rewrite patches to
>>       get rid of response queue on top.
> Let's have a quick look at above [1], if it's not a good reason (or
> even it's still unclear) then let's drop the queue.  It'll be
> perfectly fine I rebase my work upon Marc-André's.  After all the only
> reason to keep the response queue for me is to save time (for anyone
> who will be working with monitors).  If we spend too much time on
> judging whether we should keep the queue (we've already spent some)
> then it's already a waste of time...  It does not worth it IMO.

Time spent on coming up with a high-level plan for flow control is time
well spent.

Sometimes, you have to explore and experiment before you can come up
with a high-level plan that has a reasonable chance of working.  That's
no license to skip the "think before you hack" step entirely.

Here's the simplest (and possibly naive) plan I can think of: if
mon->outbuf exceeds a limit, suspend reading input.  No response queue.
Would it have a reasonable chance of working?
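
A toy model of that plan, in Python rather than QEMU's C.  All names
are hypothetical, and the limit is whatever we'd pick.  It only shows
the control flow: output appended to the buffer can suspend reading,
and draining the buffer resumes it.

```python
class ToyMonitor:
    """Toy model: suspend reading input while outbuf exceeds a limit (not QEMU code)."""

    def __init__(self, outbuf_limit):
        self.outbuf_limit = outbuf_limit
        self.outbuf = bytearray()
        self.reading_suspended = False

    def _update_reading(self):
        # Suspend while the buffer exceeds the limit, resume otherwise.
        self.reading_suspended = len(self.outbuf) > self.outbuf_limit

    def emit(self, text):
        """A command (or anything else) produced output text (bytes)."""
        self.outbuf += text
        self._update_reading()

    def flush(self, max_bytes):
        """The client accepted up to max_bytes; returns how many were written."""
        n = min(max_bytes, len(self.outbuf))
        del self.outbuf[:n]
        self._update_reading()
        return n
```

Note the limit is soft in this model: a command that is already running
can still push the buffer past it.  Output that isn't caused by input,
such as events, isn't throttled by suspending input either.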

>> > If we found that it's too hard to enable it by default, I'm thinking
>> > whether we can make it a persistent flag for monitor (maybe turning
>> > the "x-oob" into a real "oob" and keep it, then we don't turn it on by
>> > default), then we can let libvirt start working with out-of-band with
>> > the flag.  After all it's actually working mostly (the pending issues
>> > are only things like flow control for malicious/buggy clients, but
>> > libvirt never had such an issue with it).
>> The OOB job isn't complete without working flow control.  Nevertheless,
>> I'm willing to consider enabling OOB without working flow control.
> That'll be great.  Thanks!
> (Though I think the current OOB series should have addressed all but
>  the out_buf flow control issue, right?)

I hope so, but I haven't been able to review it, yet :)
