From: Denis Plotnikov
Subject: Re: [RFC PATCH 0/3] block: Synchronous bdrv_*() from coroutine in different AioContext
Date: Tue, 19 May 2020 16:54:18 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.4.1



On 19.05.2020 15:32, Vladimir Sementsov-Ogievskiy wrote:
> 14.05.2020 17:26, Kevin Wolf wrote:
>> On 14.05.2020 at 15:21, Thomas Lamprecht wrote:
>>> On 5/12/20 4:43 PM, Kevin Wolf wrote:
>>>> Stefan (Reiter), after looking a bit closer at this, I think there is no
>>>> bug in QEMU, but the bug is in your coroutine code that calls block
>>>> layer functions without moving into the right AioContext first. I've
>>>> written this series anyway as it potentially makes the life of callers
>>>> easier and would probably make your buggy code correct.
>>>>
>>>> However, it doesn't feel right to commit something like patch 2 without
>>>> having a user for it. Is there a reason why you can't upstream your
>>>> async snapshot code?
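
For illustration, the "move into the right AioContext first" pattern Kevin
describes might look like the following minimal sketch inside a coroutine.
The function snapshot_co and its overall shape are assumptions made for the
example, not code from this series:

    /* Minimal sketch: hop the current coroutine into the AioContext of
     * the block node before calling block-layer functions on it. */
    static void coroutine_fn snapshot_co(void *opaque)
    {
        BlockDriverState *bs = opaque;
        AioContext *bs_ctx = bdrv_get_aio_context(bs);

        if (qemu_get_current_aio_context() != bs_ctx) {
            /* Reschedule ourselves into bs's context and yield; the
             * coroutine resumes inside bs_ctx. */
            aio_co_schedule(bs_ctx, qemu_coroutine_self());
            qemu_coroutine_yield();
        }

        bdrv_flush(bs); /* now called from the node's own AioContext */
    }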

>>> I understand what you mean, but it would IMO make the interface so much
>>> easier to use; if one wants to explicitly schedule it beforehand, they
>>> still can. But that would open the way for two styles of doing things,
>>> and I'm not sure whether that would be seen as bad. The assert from
>>> patch 3/3 would already help a lot, though.

>> I think patches 1 and 3 are good to be committed either way if people
>> think they are useful. They make sense without the async snapshot code.
>>
>> My concern with the interface in patch 2 is both that it could give
>> people a false sense of security and that it would be tempting to write
>> inefficient code.
>>
>> Usually, you won't have just a single call into the block layer for a
>> given block node, but you'll perform multiple operations. Switching to
>> the target context once rather than switching back and forth in every
>> operation is obviously more efficient.
>>
>> But chances are that even if one of these functions is bdrv_flush(),
>> which now works correctly from a different thread, you might need
>> another function that doesn't implement the same magic. So you always
>> need to be aware of which functions support cross-context calls and
>> which ones don't.
>>
>> I feel we'd see a few bugs related to this.
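
As an illustration of the switch-once point, a sketch using the per-node
AioContext lock as it existed at the time, for a non-coroutine caller; this
is not code from the series:

    AioContext *ctx = bdrv_get_aio_context(bs);

    /* Enter the node's context once ... */
    aio_context_acquire(ctx);

    /* ... and issue all operations from there, instead of paying for a
     * context switch on every single call. */
    bdrv_drained_begin(bs);
    bdrv_flush(bs);
    /* ... further bdrv_*() calls on bs ... */
    bdrv_drained_end(bs);

    aio_context_release(ctx);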

>>> Regarding upstreaming, there was a historical attempt by Dietmar to
>>> upstream it, but that was roughly 8 to 10 years ago. I'm not quite sure
>>> why it didn't go through then; I'll see if I can find some time to
>>> search the mailing list archive.
>>>
>>> We'd naturally be open and glad to upstream it; what it effectively
>>> allows us to do is to not block the VM too much while snapshotting it
>>> live.

>> Yes, there is no doubt that this is useful functionality. There has been
>> talk about this every now and then, but I don't think we ever got to a
>> point where it actually could be implemented.
>>
>> Vladimir, I seem to remember you (or someone else from your team?) were
>> interested in async snapshots as well a while ago?

> Den is working on this (add him to CC)
Yes, I was working on that.

What I've done can be found here: https://github.com/denis-plotnikov/qemu/commits/bgs_uffd

The idea was to save a snapshot (state + RAM) asynchronously to a separate (raw) file using the existing infrastructure.
The goal was to reduce VM downtime when taking a snapshot.

We decided to postpone this work until "userfaultfd write-protect mode" landed in the mainline Linux kernel. Now that userfaultfd-wp is merged into Linux, we have plans to continue this work.
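
For reference, write-protecting a RAM region with userfaultfd-wp (Linux 5.7
and later) looks roughly like the sketch below; the helper wp_protect_region
is illustrative and not taken from the branch above:

    #include <fcntl.h>
    #include <linux/userfaultfd.h>
    #include <sys/ioctl.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Returns a userfaultfd reporting write faults for [addr, addr+len),
     * or -1 on error. The snapshot writer then copies pages out and
     * un-protects them with UFFDIO_WRITEPROTECT (mode 0) as the guest
     * touches them. */
    static int wp_protect_region(void *addr, unsigned long len)
    {
        int uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
        if (uffd < 0) {
            return -1;
        }

        struct uffdio_api api = { .api = UFFD_API, .features = 0 };
        if (ioctl(uffd, UFFDIO_API, &api) < 0) {
            goto fail;
        }

        /* Register the region for write-protect tracking ... */
        struct uffdio_register reg = {
            .range = { .start = (unsigned long)addr, .len = len },
            .mode  = UFFDIO_REGISTER_MODE_WP,
        };
        if (ioctl(uffd, UFFDIO_REGISTER, &reg) < 0) {
            goto fail;
        }

        /* ... and actually arm write protection on it. */
        struct uffdio_writeprotect wp = {
            .range = { .start = (unsigned long)addr, .len = len },
            .mode  = UFFDIO_WRITEPROTECT_MODE_WP,
        };
        if (ioctl(uffd, UFFDIO_WRITEPROTECT, &wp) < 0) {
            goto fail;
        }

        return uffd;

    fail:
        close(uffd);
        return -1;
    }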

Regarding saving the "internal" snapshot to qcow2, I still have a question. Maybe this is the right place and time to ask it.

If I remember correctly, in qcow2 the snapshot is stored at the end of the address space of the current block-placement table, and we switch to the new block-placement table after the snapshot has been stored. In the case of an async snapshot, we would have to switch the table before the snapshot is written; in other words, we would need to be able to preallocate the space for the snapshot and keep a link
to that space until snapshot writing is completed.
The question is whether this could be done without modifying qcow2, and if not, could you please give some ideas of how to implement it?

Denis


>>> I pushed a tree[0] with mostly just that specific code squashed together
>>> (hope I did not break anything); most of the actual code is in commit [1].
>>> It'd still need to be cleaned up a bit and checked for coding style
>>> issues, but it works well here.
>>>
>>> Anyway, thanks for your help and pointers!
>>>
>>> [0]: https://github.com/ThomasLamprecht/qemu/tree/savevm-async
>>> [1]: https://github.com/ThomasLamprecht/qemu/commit/ffb9531f370ef0073e4b6f6021f4c47ccd702121

>> It doesn't even look that bad in terms of patch size. I had imagined it
>> a bit larger.
>>
>> But it seems this is not really just an async 'savevm' (which would save
>> the VM state in a qcow2 file); instead, you store the state in a separate
>> raw file. What is the difference between this and regular migration into
>> a file?
>>
>> I remember people talking about how snapshotting can store things in a
>> way that a normal migration stream can't, like overwriting outdated
>> RAM state instead of just appending the new state, but you don't seem to
>> implement something like this.
>>
>> Kevin
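
To make the "overwriting outdated RAM state" idea concrete: with a fixed
per-page offset in the state file, a page that is dirtied again simply
overwrites its previous copy instead of appending a new one. A hypothetical
sketch (the layout and helper name are assumptions, not from any patch here):

    #include <stdint.h>
    #include <unistd.h>

    /* Hypothetical layout: RAM page i lives at offset i * page_size, so
     * re-saving a re-dirtied page overwrites the stale copy in place. */
    static int save_ram_page(int fd, uint64_t page_index,
                             const void *page, size_t page_size)
    {
        off_t off = (off_t)(page_index * page_size);
        ssize_t n = pwrite(fd, page, page_size, off);
        return n == (ssize_t)page_size ? 0 : -1;
    }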
