Re: [Qemu-devel] [PATCH v4 00/20] monitor: add asynchronous command type


From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH v4 00/20] monitor: add asynchronous command type
Date: Thu, 23 May 2019 09:52:21 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)

Marc-André Lureau <address@hidden> writes:

> Hi
>
> On Tue, May 21, 2019 at 4:18 PM Markus Armbruster <address@hidden> wrote:
>>
>> Marc-André, before you invest your time to answer my questions below: my
>> bandwidth for non-trivial QAPI features like this one is painfully
>> limited.  To get your QAPI conditionals in, I had to postpone other QAPI
>> projects.  I don't regret doing that, I'm rather pleased with how QAPI
>> conditionals turned out.  But I can't keep postponing all other QAPI
>> projects.  Because of that, this one will be slow going, at best.  Sorry
>> about that.
>
> We have different priorities, fair enough.

I wish I could give you better service.  But no use pretending.

>> Marc-André Lureau <address@hidden> writes:
>>
>> > Hi,
>> >
>> > HMP and QMP commands are handled synchronously in qemu today. But
>> > there are benefits allowing the command handler to re-enter the main
>> > loop if the command cannot be handled synchronously, or if it is
>> > long-lasting. Some bugs such as rhbz#1230527 are difficult to solve
>> > without it.
>> >
>> > The common solution is to use a pair of command+event in this case.
>>
>> In particular, background jobs (qapi/jobs.json).  They grew out of block
>> jobs, and are still used only for "blocky" things.  Using them more
>> widely would probably make sense.
>>
>> > But this approach has a number of issues:
>> > - you can't "fix" an existing command: you need a new API, and ad-hoc
>> >   documentation for that command+signal association, and old/broken
>> >   command deprecation
>>
>> Making a synchronous command asynchronous is an incompatible change.  We
>> need to let the client opt in.  How is that done in this series?
>
> No change visible on client side. I dropped the async command support
> a while ago already, based on your recommendations. I can dig the
> archive for the discussion if necessary.

Not right now.

>> > - since the reply event is broadcast and 'id' is used for matching the
>> >   request, it may conflict with other clients' request 'id' space
>>
>> Any event that does that now is broken and needs to be fixed.  The
>> obvious fix is to include a monitor ID with the command ID.  For events
>> that can only ever be useful in the context of one particular monitor,
>> we could unicast to that monitor instead; see below.
>>
>> Corollary: this is just a fixable bug, not a fundamental advantage of
>> the async feature.
>
> I am just pointing out today's drawbacks of turning a function async by
> introducing new commands and signals.

And I'm just pointing out that some of today's drawbacks could also be
addressed differently :)
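
A minimal sketch of the "monitor ID plus command ID" fix, in Python.  The
event name EXAMPLE_DONE and the "monitor-id" field are made up for
illustration; no existing QMP event is claimed to look like this.

```python
# Hypothetical event shape: the "monitor-id" and "id" fields are
# illustrative assumptions, not part of any existing QMP event.

def is_my_reply(event, monitor_id, pending_ids):
    """Return True if a broadcast completion event answers one of our requests."""
    data = event.get("data", {})
    return (data.get("monitor-id") == monitor_id
            and data.get("id") in pending_ids)


events = [
    {"event": "EXAMPLE_DONE", "data": {"monitor-id": "mon0", "id": "cmd-1"}},
    {"event": "EXAMPLE_DONE", "data": {"monitor-id": "mon1", "id": "cmd-1"}},
]

# Two clients both picked the command ID "cmd-1"; only the monitor ID
# tells their completion events apart.
mine = [e for e in events if is_my_reply(e, "mon1", {"cmd-1"})]
```

With the monitor ID in the key, clients no longer have to coordinate their
command ID spaces, even if the event is still broadcast.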

>> > - it is arguably less efficient and elegant (weird API, useless return
>> >   in most cases, broadcast reply, no cancelling on disconnect etc)
>>
>> The return value is useful for synchronously reporting failure to start
>> the background task.  I grant you that background tasks may exist that
>> won't ever fail to start.  I challenge the idea that it's most of them.
>>
>> Broadcast reply could be avoided by unicasting events.  If I remember
>> correctly, Peter Xu even posted patches some time ago.  We ended up not
>> using them, because we found a better solution for the problem at hand.
>> My point is: this isn't a fundamental problem, it's just the way we
>> coded things up.
>>
>> What do you mean by "no cancelling on disconnect"?
>
> When the client disconnects, the background task keeps running, and
> there is no simple way to know about that event afaik. My proposal has
> a simple API for that (see "qmp: add qmp_return_is_cancelled()"
> patch).

Auto-cancellation on client disconnect may be exactly what's wanted for
simple use cases.

Jobs are designed with more use cases in mind.  Consider a backup job
that takes some time.  We certainly don't want to cancel it just
because the management application hiccups and disconnects.  Instead, we
want to permit the management application to recover, reconnect, find
the backup job, examine its state, and resume managing it.  To support
this, jobs have a unique ID.  Job cancellation is explicit.

Jobs could acquire an "auto-cancel on disconnect" feature if there's a
need.

I'm not sure how asynchronous commands could support reconnect and
resume.
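
To make the difference concrete, here is a toy model in Python.  It is not
QEMU code: the JobRegistry class, its methods and the auto_cancel flag are
all hypothetical, and the flag stands in for the possible "auto-cancel on
disconnect" feature mentioned above.

```python
class JobRegistry:
    """Toy model: jobs are keyed by a unique ID and outlive client sessions."""

    def __init__(self):
        self.jobs = {}  # job ID -> {"status", "owner", "auto_cancel"}

    def start(self, job_id, owner, auto_cancel=False):
        if job_id in self.jobs:
            raise ValueError("job ID already in use: " + job_id)
        self.jobs[job_id] = {"status": "running", "owner": owner,
                             "auto_cancel": auto_cancel}

    def client_disconnected(self, owner):
        # Only jobs that opted in are cancelled when their client goes away.
        for job in self.jobs.values():
            if (job["owner"] == owner and job["auto_cancel"]
                    and job["status"] == "running"):
                job["status"] = "cancelled"

    def query(self, job_id):
        return self.jobs[job_id]["status"]

    def cancel(self, job_id):
        # Explicit cancellation, available to any client that knows the ID.
        self.jobs[job_id]["status"] = "cancelled"


registry = JobRegistry()
registry.start("backup0", owner="client-a")
registry.start("dump0", owner="client-a", auto_cancel=True)
registry.client_disconnected("client-a")

# After reconnecting, the client finds the backup by its ID, still running.
backup_status = registry.query("backup0")
dump_status = registry.query("dump0")
```

Because the job ID does not depend on the session, a reconnecting client
can still query or cancel "backup0"; a command whose state lives only in
the session context has nothing comparable to look up.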

>> I'm ignoring "etc" unless you expand it into something specific.
>>
>> I'm also not taking the "weird" bait :)
>>
>> > The following series implements an async command solution instead. By
>> > introducing a session context and a command return handler, it can:
>> > - defer the return, allowing the mainloop to reenter
>> > - return only to the caller (instead of broadcast events for reply)
>> > - optionally allow cancellation when the client is gone
>> > - track on-going qapi command(s) per client/session
>> >
>> > and without introduction of new QMP APIs or client visible change.
>>
>> What do async commands provide that jobs lack?
>>
>> Why do we want both?
>
> They are different things, last we discussed it: jobs are geared
> toward block device operations,

Historical accident.  We've discussed using them for non-blocky stuff,
such as migration.  Of course, discussions are cheap, code is what
counts.

>                                 and do not provide simple qmp-level
> facilities that I listed above. What I introduce is a way for an
> *existing* QMP command to be split, so it can re-enter the main
> loop sanely (and not by introducing new commands or signals or making
> things unnecessarily more complicated).
>
> My proposal is fairly small:
>   27 files changed, 877 insertions(+), 260 deletions(-)
>
> Including test, and the qxl screendump fix, which account for about
> 1/3 of the series.
>
>> I started to write a feature-by-feature comparison, but realized I don't
>> have the time to figure out either jobs or async from their (rather
>> sparse) documentation, let alone from code.
>>
>> > Existing qemu commands can be gradually replaced by async:true
>> > variants when needed, while carefully reviewing the concurrency
>> > aspects. The async:true commands marshaller helpers are split in
>> > half, the calling and return functions. The command is called with a
>> > QmpReturn context, that can return immediately or later, using the
>> > generated return helper, which allows for a step-by-step conversion.
>> >
>> > The screendump command is converted to an async:true version to solve
>> > rhbz#1230527. The command shows basic cancellation (this could be
>> > extended if needed). It could be further improved to do asynchronous
>> > IO writes as well.
>>
>> What is "basic cancellation"?
>> What extension(s) do you have in mind?
>
> It checks for cancellation in a few places, between IO. Full
> cancellation would allow cancelling at any time.
>
>>
>> What's the impact of screendump writing synchronously?
>
> It can be pretty bad, think about 4k screens. It is 33177600 bytes,
> written in PPM format, blocking the main loop.

My question was specifically about "could be further improved to do
asynchronous IO writes as well".  What's the impact of not having this
improvement?  I *guess* it means that even with the asynchronous
command, the synchronous writes still block "something", but I'm not
sure what "something" may be, and how it could impact behavior.  Hence
my question.
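
For scale, the 33177600 figure quoted above can be reproduced as follows.
The 3840x2160 resolution and the 4 bytes per pixel are my assumptions; the
thread states neither.  The PPM size assumes a binary P6 file with maxval
255, which stores 3 bytes per pixel.

```python
# Assumptions: "4k" means 3840x2160, and the in-memory surface uses
# 4 bytes per pixel.  Neither is stated in the thread.
width, height = 3840, 2160
surface_bytes = width * height * 4   # matches the 33177600 quoted above

# A binary PPM (P6) with maxval 255 stores 3 bytes per pixel after a
# short text header.
header = f"P6\n{width} {height}\n255\n".encode("ascii")
ppm_bytes = len(header) + width * height * 3
```

Under these assumptions the file actually written is about 24.9 MB, a
quarter smaller than the surface, but the order of magnitude is the same.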

> QMP operation doing large IO (dumps), or blocking on events, could be
> switched to this async form without introducing user-visible change,

Letting the next QMP command start before the current one is done is a
user-visible change.  We can discuss whether the change is harmless.
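
A toy illustration of what a client could observe, in Python with asyncio.
This is not the series' code, and I don't claim the series reorders
replies in exactly this way: the handlers, the command IDs and the
defer_screendump switch are all hypothetical.

```python
import asyncio


async def run_commands(defer_screendump):
    replies = []  # command IDs, in the order their replies are sent

    async def screendump(cmd_id):
        # Stand-in for a long-running command.
        await asyncio.sleep(0.05)
        replies.append(cmd_id)

    async def query_status(cmd_id):
        # Stand-in for a command that completes immediately.
        replies.append(cmd_id)

    if defer_screendump:
        # Deferred return: the dispatcher moves on while the handler runs.
        task = asyncio.create_task(screendump("1"))
        await query_status("2")
        await task
    else:
        # Synchronous dispatch: each command finishes before the next starts.
        await screendump("1")
        await query_status("2")
    return replies


sync_order = asyncio.run(run_commands(defer_screendump=False))
async_order = asyncio.run(run_commands(defer_screendump=True))
```

Synchronous dispatch replies to "1" and then "2"; with the deferred return
the reply to "2" arrives first.  A client that assumes replies come back
in request order would notice the difference.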

> and with minimal effort compared to jobs.

To gauge the difference in effort, we'd need actual code to compare.


