qemu-devel

Re: bitmap migration bug with -drive while block mirror runs


From: John Snow
Subject: Re: bitmap migration bug with -drive while block mirror runs
Date: Wed, 2 Oct 2019 17:35:24 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.1.0

On 10/2/19 6:46 AM, Peter Krempa wrote:

[ * poof * ]

> 
> I'd like to re-iterate that the necessity to keep node names same on
> both sides of migration is unexpected, undocumented and in some cases
> impossible.
> 
> If you want to mandate that they must be kept the same please document
> it and also note the following:
> 
> - during migrations the storage layout may change e.g. a backing chain
>   may become flattened, thus keeping node names stable beyond the top
>   layer is impossible
> 

The structure and layout of the graph are entirely unrelated to the
requirement that a bitmap attached to a node with a name on the source
needs to have a node with the same name on the destination. It's an
addressability requirement only.

Change it entirely to a new drive if you want, move it up or down the
graph, it doesn't matter.

Libvirt is in the best position to understand where bitmaps already are
and where it wants them to go.

> - in some cases (readonly image in a cdrom not present on destination,
>   thus not relevant here probably) it may even become impossible to
>   create any node thus keeping the top node may be impossible
> 

It's not mandatory to recreate the graph exactly. Consider what you are
saying:

- Libvirt adds a bitmap to node "N"
- Libvirt asks QEMU for bitmap migration
- Libvirt migrates to a QEMU instance that not only does not have a node
"N", but has no analogous node at all!

I believe this is right to fail as there is no way to fulfill the
request as-is.

(More below if you feel it's valid to migrate only some bitmaps.)

> - it should be documented when and why this happens and how management
>   tools are supposed to do it
> 

OK, agreed, and I am sorry that our existing story has been hand-wavy.

Let me tell you the exact specifics of the current broken logic so you
can understand the requirements as they exist right now.

1. Bitmaps attempt to use their device name to migrate, if available.
This covers 99% of use cases where a bitmap was added to a node that was
attached directly to a device model.

This includes almost all usual cases: bitmaps loaded from qcow2 files,
bitmaps added via QMP to a root node, bitmaps added via QMP to a drive name.

(It does not include cases where bitmaps are intentionally added to
nodes that aren't a device root. Libvirt, I believe, can simply never do
this and it will never come up.)

2. If a device name isn't available because this bitmap is not attached
to a root OR the BlockBackend (BB) does not have a name, we migrate
using the node name.

3. No attention is paid whatsoever to whether a node name is
automatically generated or not. In effect, if the device name lookup
fails we currently "trust" that the node name is something meaningful.

4. The bug as I originally perceived it relates specifically to our
failure to resolve the device name after graph manipulations.
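
To make rules 1-3 concrete, here is a minimal Python sketch of the lookup
order. It is a model of the description above, not QEMU's code: the Node
class and the function name are hypothetical, and the "#block123" style
used for an auto-generated node name is an assumption for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical stand-in for a block node; not a QEMU type.
    node_name: str                     # always set; may be auto-generated
    device_name: Optional[str] = None  # set only when a named BB sits on this node


def current_migration_alias(node: Node) -> str:
    """Name placed in the migration stream for bitmaps on `node`."""
    if node.device_name:
        # Rule 1: prefer the device name when one can be resolved.
        return node.device_name
    # Rules 2 and 3: fall back to the node name and trust it as-is,
    # without checking whether it was auto-generated.
    return node.node_name
```

With this model, Node("#block123", "drive0") migrates as "drive0". If the
device name lookup fails (point 4), the same node migrates as "#block123",
which the destination has no reason to know about.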


Under these rules, if we "fixed" #4, node-names wouldn't show up in the
stream at all if you never attached a bitmap to a non-root node. This is
probably what you expected.

Node-names only feature in cases where we can't find a device/drive
name, which is:
A. When a bitmap is attached to a non-root node specifically. Libvirt
can simply never do this!
B. While the graph is being transformed for drive-mirror; see point #4 above.


The workaround for this bug if we don't find a good policy:

1. Use blockdev.
2. Give explicit, semantic names to the root nodes that represent the drive.
3. Any name used to add a bitmap must appear on the destination in a
migration.
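
As a rough illustration of that workaround, the Python sketch below builds
the QMP commands a management tool might send and checks step 3 before
migrating. The node name, bitmap name, and file paths are hypothetical;
only the blockdev-add and block-dirty-bitmap-add command shapes are taken
from QMP.

```python
import json

ROOT_NODE = "disk0-root"  # hypothetical explicit, semantic node name
BITMAP = "backup0"        # hypothetical bitmap name


def blockdev_add(node_name: str, filename: str) -> dict:
    # qcow2 format node on top of a file protocol node.
    return {
        "execute": "blockdev-add",
        "arguments": {
            "driver": "qcow2",
            "node-name": node_name,
            "file": {"driver": "file", "filename": filename},
        },
    }


def bitmap_add(node_name: str, bitmap: str) -> dict:
    return {
        "execute": "block-dirty-bitmap-add",
        "arguments": {"node": node_name, "name": bitmap},
    }


def bitmap_target_names(cmds) -> set:
    """Names that were used to add bitmaps."""
    return {c["arguments"]["node"] for c in cmds
            if c["execute"] == "block-dirty-bitmap-add"}


def defined_node_names(cmds) -> set:
    """Node names created explicitly via blockdev-add."""
    return {c["arguments"]["node-name"] for c in cmds
            if c["execute"] == "blockdev-add"}


# Paths are placeholders; the destination layout may differ from the source.
source_cmds = [blockdev_add(ROOT_NODE, "/src/disk0.qcow2"),
               bitmap_add(ROOT_NODE, BITMAP)]
dest_cmds = [blockdev_add(ROOT_NODE, "/dst/disk0.qcow2")]

# Step 3: every name used to add a bitmap must exist on the destination.
missing_on_destination = (bitmap_target_names(source_cmds)
                          - defined_node_names(dest_cmds))
wire_format = [json.dumps(c) for c in source_cmds + dest_cmds]
```

An empty missing_on_destination set means the naming requirement is met;
it says nothing about the rest of the graph, which is free to differ.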


> - please let me know what's actually expected, since libvirt
>   didn't enable blockdev yet we can fix any unexpected expectations
> 

I have been and will continue to be diligent in CCing you and the libvirt list.

At the moment I am still leaning towards the idea that libvirt should
expect that any bitmaps attached to a node with an explicit node-name
will want to use those names to migrate, but that we might be able to
limit the cases such that you will be able to avoid the circumstance
entirely.

However, QEMU's actual implementation is that they are node objects. QEMU
is ill-equipped to make semantic decisions about what the bitmaps "mean"
or "represent"; the name is unfortunately the most explicit identifier
we have to convey what bitmap we are talking about.

It will be libvirt's job to use node names to help facilitate QEMU's
transfer of these objects during migration in a semantically helpful way.

> - Document it so that the expectations don't change after this.
> 

OK. I will take charge of this, once we reach a consensus.

> - Ideally node names will not be bound to anything and freely
>   changeable. If necessary we can provide a map to qemu during migration
>   which is probably less painful and more straightforward than keeping
>   them in sync somehow ...

Why do you want node names to be freely manipulable?

The only constraint we've actually added is that a root node (that has a
bitmap) attached to a device needs to have a name that is available on
the target.

(Oh, and, that the virtual size of that target matches the source.)

> 



Phew. In terms of non-direct replies to Peter's questions above, I've
written out like a dozen failed replies to this, so I'm still quite
confused but need to work on other things today.

I currently think that:


1. If a user uses block-dirty-bitmap-add, we have some sense of where
they wanted the bitmap to go in the graph because they specified a name.
Migration, if left as an automatic (opt-in) process, should try to
migrate in-kind:

- If the user used a drive name, try to use a drive name to migrate. If
there is no drive name and our node name is autogenerated, we cannot
migrate this bitmap.
- If the user used an explicit, non-generated node name, use the node
name. If the user used an implicit node-name, we need to try to resolve
the device name again. If that's not possible, the bitmap cannot be
migrated.
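
The "migrate in-kind" idea can be sketched in Python as follows. This is
only a model of the two bullets above, with hypothetical type and function
names. One case is not spelled out above (bitmap added via a drive name,
no drive name resolvable, but the node name is explicit); the sketch
ASSUMES it falls back to the node name, and that may not be what we end up
wanting.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AddedVia(Enum):
    DRIVE_NAME = "drive"
    NODE_NAME = "node"


@dataclass
class BitmapOrigin:
    # Hypothetical record of how block-dirty-bitmap-add addressed the bitmap.
    added_via: AddedVia
    node_name: str
    node_name_generated: bool
    device_name: Optional[str]  # None if it cannot be resolved right now


def proposed_migration_alias(b: BitmapOrigin) -> Optional[str]:
    """Return the name to migrate under, or None if the bitmap cannot migrate."""
    if b.added_via is AddedVia.NODE_NAME and not b.node_name_generated:
        # Explicit node name: use it as given.
        return b.node_name
    # Drive name, or implicit node name: try to resolve the device name.
    if b.device_name:
        return b.device_name
    if b.added_via is AddedVia.DRIVE_NAME and not b.node_name_generated:
        # ASSUMPTION: this case is unspecified in the text above.
        return b.node_name
    # No device name and only an auto-generated node name: cannot migrate.
    return None
```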


This implies that QEMU will try to "guess" where bitmaps go when using
-drive/-device, but will rely on explicit configuration when using
blockdev. I think the spirit of this idea is correct.

(Vladimir: this is indeed different from EITHER of my suggested
resolution orders over the last two days.)



2. I like Vladimir's idea of providing a "default" migration approach,
but allowing libvirt to override some features of it if necessary.

Overrides that I think will be helpful in alleviating any pain in the
long term:

- Whitelists / Blacklists

The ability to provide either a whitelist or a blacklist for bitmaps
that we desire to migrate. The default can continue to be: "All bitmaps
with a name." This will allow libvirt to drop bitmaps at its discretion
if it performs a block graph reconfiguration on migration and the bitmap
is no longer semantically relevant or appropriate for whatever reason.
This is superior to explicitly deleting bitmaps or dropping nodes in
order to have a valid recourse on failed migrations.


- The ability to override specific mappings on an as-needed basis. I
believe the default resolution mechanism should be one that behaves like
I specify above; but if that resolution is untenable for some reason,
you can specify a remapping if you really require.

I am hoping that remapping is not actually necessary, because I
think it's sufficient to use node-names to explicitly direct bitmaps to
their intended targets.

But if we truly do need that power, I'm open to providing an interface
to specify it.
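
To show how the two overrides could sit on top of a default, here is a
small Python sketch. No such interface exists today; the function name,
its parameters, and the (owner name, bitmap name) keying are all
hypothetical and only model the behavior described above.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

# (name the bitmap would migrate under by default, bitmap name or None)
SourceBitmap = Tuple[str, Optional[str]]


def plan_bitmap_migration(
    bitmaps: Iterable[SourceBitmap],
    whitelist: Optional[Set[Tuple[str, str]]] = None,
    blacklist: Optional[Set[Tuple[str, str]]] = None,
    remap: Optional[Dict[str, str]] = None,
) -> List[Tuple[str, str]]:
    """Return (destination name, bitmap name) pairs to migrate."""
    if whitelist is not None and blacklist is not None:
        raise ValueError("give either a whitelist or a blacklist, not both")
    remap = remap or {}
    plan = []
    for owner, name in bitmaps:
        if not name:
            continue  # default policy: only bitmaps with a name
        key = (owner, name)
        if whitelist is not None and key not in whitelist:
            continue
        if blacklist is not None and key in blacklist:
            continue
        # Default is the same name on both sides; remap overrides as needed.
        plan.append((remap.get(owner, owner), name))
    return plan
```

Called with no overrides, this keeps the default of migrating every named
bitmap under its own name; a blacklist drops bitmaps without deleting them
on the source, which is what keeps a failed migration recoverable.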



I hope everyone is as confused as I am, now.
--js


