Re: bitmap migration bug with -drive while block mirror runs

From: Peter Krempa
Subject: Re: bitmap migration bug with -drive while block mirror runs
Date: Tue, 1 Oct 2019 06:28:57 +0200
User-agent: Mutt/1.12.1 (2019-06-15)

On Mon, Sep 30, 2019 at 20:09:28 -0400, John Snow wrote:
> Hi folks, I identified a problem with the migration code that Red Hat QE
> found and thought you'd like to see it:
> https://bugzilla.redhat.com/show_bug.cgi?id=1652424#c20
> Very, very briefly: drive-mirror inserts a filter node that changes what
> bdrv_get_device_or_node_name() returns, which causes a migration problem.
> Ignorant question #1: Can we multi-parent the filter node and
> source-node? It looks like at the moment both consider their only parent
> to be the block-job and don't have a link back to their parents otherwise.
> Otherwise: I have a lot of cloudy ideas on how to solve this, but
> ultimately what we want is to be able to find the "addressable" name for
> the node the bitmap is attached to, which would be the name of the first
> ancestor node that isn't a filter. (OR, the name of the block-backend
> above that node.)
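
The lookup described in the quoted paragraph above can be illustrated with a toy model. This is only a sketch under stated assumptions, not qemu code: the `Node` class, its fields, and `addressable_name()` are hypothetical names made up for illustration, and the model assumes each node has at most one parent link. `device_or_node_name()` only mimics the behaviour described for bdrv_get_device_or_node_name() (device name if a block backend is attached, node name otherwise).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for a block node; not a qemu structure."""
    node_name: str
    is_filter: bool = False
    device_name: Optional[str] = None  # name of an attached block backend, if any
    parent: Optional["Node"] = None    # assumption: a single parent link


def device_or_node_name(node: Node) -> str:
    # Device name when a block backend is attached, otherwise the node name.
    return node.device_name or node.node_name


def addressable_name(node: Node) -> str:
    # Walk upwards through filter parents until a node with an attached
    # block backend is found or no filter parent remains.
    current = node
    while (current.device_name is None
           and current.parent is not None
           and current.parent.is_filter):
        current = current.parent
    return device_or_node_name(current)


# A mirror-like filter sits between the block backend "drive0" and the
# node that owns the bitmap, which has an autogenerated name.
source = Node(node_name="#block123")
mirror_filter = Node(node_name="mirror_top", is_filter=True, device_name="drive0")
source.parent = mirror_filter

plain_name = device_or_node_name(source)   # the autogenerated node name
stable_name = addressable_name(source)     # the device name above the filter
```

In this model the plain lookup yields the autogenerated name once the filter is inserted, while the upward walk still finds the device name that the migration target can match.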

If there isn't an elegant qemu-based solution, one possibility would be
to add a migration feature that libvirt could enable to stop qemu from
copying bitmaps; libvirt would then copy them explicitly in the
synchronised phase of the migration.

Libvirt might need to do that anyway for inactive bitmaps if the
automatic bitmap copying covers only active bitmaps.

I'm not sure though how that would combine with post-copy migration or
what the impact on latency would be, but if you are migrating with
storage I think performance will not be stellar anyways.

> A simple way to do this might be a "child_unfiltered" BdrvChild role
> that simply bypasses the filter that was inserted and serves no real
> purpose other than to allow the child to have a parent link and find who
> its """real""" parent is.
> Because of flushing, reopen, sync, drain &c &c &c I'm not sure how
> feasible this quick idea might be, though.
> - Corollary fix #1: call error_setg if the bitmap node name that's about
> to go over the wire is an autogenerated node: this is never correct!
> (Why not? because the target is incapable of matching the node-name
> because they are randomly generated AND you cannot specify node-names
> with # prefixes as they are especially reserved!
> (This raises a related problem: if you explicitly add bitmaps to nodes
> with autogenerated names, you will be unable to migrate them.))

I think this should be okay. In libvirt I opted to forbid checkpoints,
which map to bitmap creation, until blockdev is supported, at which
point we manage node names ourselves.
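
As a rough illustration of the check proposed in "Corollary fix #1", here is a minimal sketch. It relies only on the statement in the quoted text that autogenerated node names use the reserved '#' prefix; the function names and the error type are hypothetical, and the real check would live in qemu's migration code and report through error_setg.

```python
def is_autogenerated_node_name(node_name: str) -> bool:
    # Assumption taken from the thread: autogenerated node names start
    # with '#', a prefix users cannot specify themselves.
    return node_name.startswith("#")


def check_bitmap_migratable(node_name: str, bitmap_name: str) -> None:
    # Hypothetical guard: refuse to send a bitmap whose node name the
    # migration target has no way of matching.
    if is_autogenerated_node_name(node_name):
        raise ValueError(
            f"bitmap '{bitmap_name}' is attached to node '{node_name}', "
            "which has an autogenerated name and cannot be matched "
            "on the migration target"
        )


def try_check(node_name: str, bitmap_name: str) -> bool:
    # Small helper for demonstration: True if the bitmap may be sent.
    try:
        check_bitmap_migratable(node_name, bitmap_name)
    except ValueError:
        return False
    return True
```

With explicitly managed node names the guard never fires, which is why it should not get in the way once node names are under the management layer's control.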
