qemu-devel

Re: bitmap migration bug with -drive while block mirror runs


From: Kevin Wolf
Subject: Re: bitmap migration bug with -drive while block mirror runs
Date: Wed, 2 Oct 2019 13:11:47 +0200
User-agent: Mutt/1.12.1 (2019-06-15)

Am 02.10.2019 um 12:46 hat Peter Krempa geschrieben:
> On Tue, Oct 01, 2019 at 12:07:54 -0400, John Snow wrote:
> > 
> > 
> > On 10/1/19 11:57 AM, Vladimir Sementsov-Ogievskiy wrote:
> > > 01.10.2019 17:10, John Snow wrote:
> > >>
> > >>
> > >> On 10/1/19 10:00 AM, Vladimir Sementsov-Ogievskiy wrote:
> > >>>> Otherwise: I have a lot of cloudy ideas on how to solve this, but
> > >>>> ultimately what we want is to be able to find the "addressable" name
> > >>>> for the node the bitmap is attached to, which would be the name of the
> > >>>> first ancestor node that isn't a filter. (OR, the name of the
> > >>>> block-backend above that node.)
> > >>> Not the name of an ancestor node, that would break the mapping: it must
> > >>> be the name of the node itself or the name of the parent block-backend
> > >>> (possibly through several filters).
> > >>>
> > >>
> > >> Ah, you are right of course -- because block-backends are the only
> > >> "nodes" for which we actually descend the graph and add the bitmap to
> > >> its child.
> > >>
> > >> So the real back-resolution mechanism is:
> > >>
> > 
> > Amendment:
> >    - If our local node-name N is well-formed, use this.
> 
> I'd like to reiterate that the necessity to keep node names the same on
> both sides of migration is unexpected, undocumented and in some cases
> impossible.

I think the (implicitly made) requirement is not that all node-names are
kept the same, but only the node-names of those nodes for which
migration transfers some state.

It seems to me that bitmap migration is the first case of putting
something in the migration stream that isn't related to a frontend, but
to the backend, so the usual approach of addressing information through
the device hierarchy doesn't work here. And it seems the implications of
this weren't really considered sufficiently, resulting in the design
problem we're discussing now.

What we need to transfer is dirty bitmaps, which can be attached to any
node in the block graph. If we accept that the way to transfer this is
the migration stream, we need a way to tell which bitmap belongs to
which node. Matching node-name is the obvious answer, just like a
matching device tree hierarchy is used for frontends.
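
To illustrate the matching, here is a minimal Python sketch. It is not
QEMU code: the function attach_bitmaps, its arguments, and the node and
bitmap names are all made up for this example, and the only thing it
assumes is that each bitmap in the stream is tagged with the node-name
it was attached to on the source.

```python
# Hypothetical sketch, not QEMU code: match migrated bitmaps to
# destination nodes purely by node-name.

def attach_bitmaps(incoming, dest_nodes):
    """Attach each incoming bitmap to the destination node of the same name.

    incoming:   list of (node_name, bitmap_name) pairs from the stream
    dest_nodes: dict mapping destination node-name -> set of bitmap names
    Returns the pairs whose node-name does not exist on the destination.
    """
    unmatched = []
    for node_name, bitmap_name in incoming:
        bitmaps = dest_nodes.get(node_name)
        if bitmaps is None:
            # No node with this name on the destination: nothing to attach to.
            unmatched.append((node_name, bitmap_name))
        else:
            bitmaps.add(bitmap_name)
    return unmatched
```

With this scheme, only nodes that actually carry a bitmap need a
matching name on the destination; all other node-names never show up in
the lookup.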

If we don't want to use the migration stream for backends, we would need
to find another way to transfer the bitmaps. I would welcome removing
backend data from the migration stream, but if this includes
non-persistent bitmaps, I don't see what the alternative could be.

> If you want to mandate that they must be kept the same please document
> it and also note the following:
> 
> - during migrations the storage layout may change e.g. a backing chain
>   may become flattened, thus keeping node names stable beyond the top
>   layer is impossible

You don't want to transfer bitmaps of nodes that you're going to drop.
I'm not an expert on these bitmaps, but I think this just means you
would have to disable any bitmaps on the backing files that will be
dropped, on the source host, before you migrate.

> - in some cases (readonly image in a cdrom not present on destination,
>   thus not relevant here probably) it may even become impossible to
>   create any node thus keeping the top node may be impossible

Same thing, you don't want to transfer a bitmap for a node that
disappears.

> - it should be documented when and why this happens and how management
>   tools are supposed to do it
> 
> - please let me know what's actually expected, since libvirt
>   hasn't enabled blockdev yet we can fix any unexpected expectations
> 
> - Document it so that the expectations don't change after this.

Yes, we need a good and ideally future-proof rule of which node-names
need to stay the same. Currently it's only bitmaps, but might we get
another feature later where we want to transfer more backend data?

> - Ideally node names will not be bound to anything and freely
>   changeable. If necessary we can provide a map to qemu during migration
>   which is probably less painful and more straightforward than keeping
>   them in sync somehow ...

A map feels painful for the average user (and for the QEMU
implementation), even if it looks convenient for libvirt. If anything,
I'd make it optional and default to 1:1 mappings for anything that isn't
explicitly mapped.
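
Roughly what I have in mind, as a Python sketch rather than a proposal
for an actual interface (resolve_node_name and the mapping argument are
hypothetical names, nothing like this exists in QEMU today):

```python
# Hypothetical sketch: optional source -> destination node-name map
# that falls back to a 1:1 mapping for anything not listed.

def resolve_node_name(source_name, mapping=None):
    """Return the destination node-name for a source node-name."""
    if mapping is None:
        # No map given at all: every name maps to itself.
        return source_name
    # Explicit entries win; unmapped names stay unchanged.
    return mapping.get(source_name, source_name)
```

A user who keeps node-names stable never has to pass a map, while a
management tool can override just the names that differ.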

Kevin


