qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v4 10/10] Add the drive-reopen command


From: Federico Simoncelli
Subject: Re: [Qemu-devel] [PATCH v4 10/10] Add the drive-reopen command
Date: Tue, 13 Mar 2012 20:14:16 -0400 (EDT)

----- Original Message -----
> From: "Eric Blake" <address@hidden>
> To: "Paolo Bonzini" <address@hidden>
> Cc: address@hidden, address@hidden, address@hidden, address@hidden, 
> address@hidden
> Sent: Tuesday, March 13, 2012 9:48:10 PM
> Subject: Re: [Qemu-devel] [PATCH v4 10/10] Add the drive-reopen command
> 
> On 03/06/2012 10:56 AM, Paolo Bonzini wrote:
> > From: Federico Simoncelli <address@hidden>
> > 
> > Signed-off-by: Federico Simoncelli <address@hidden>
> > Signed-off-by: Paolo Bonzini <address@hidden>
> 
> >  ##
> > +# @drive-reopen
> > +#
> > +# Assigns a new image file to a device.
> > +#
> > +# @device: the name of the device for which we are changing the
> > image file.
> > +#
> > +# @new-image-file: the target of the new image. If the file
> > doesn't exists the
> > +#                  command will fail.
> > +#
> > +# @format: #optional the format of the new image, default is
> > 'qcow2'.
> > +#
> > +# Returns: nothing on success
> > +#          If @device is not a valid block device, DeviceNotFound
> > +#          If @new-image-file can't be opened, OpenFileFailed
> > +#          If @format is invalid, InvalidBlockFormat
> > +#
> > +# Since 1.1
> > +##
> > +{ 'command': 'drive-reopen',
> > +  'data': { 'device': 'str', 'new-image-file': 'str', '*format':
> > 'str' } }
> 
> I still think we need a 'drive-reopen' action included in
> 'transaction',
> as an 11/10 on this series.  For disk migration, it is true that you
> can
> migrate one disk at a time, and therefore only need to reopen one
> disk
> at a time, to get the guarantee that for a single disk image, the
> current state of that image will be guaranteed to be consistent using
> only one storage domain.

I'm not sure if this was already addressed on this mailing list but
the main problem is that as general rule a qcow file cannot be opened
in r/w mode twice. I believe there was only one exception to that
with the live migration and it generated several issues.

That said, reopen is really hard to be implemented as a transaction
without breaking that rule. For example in the blkmirror case you'd
need to open the destination image in r/w while the mirroring is in
action (already having the same image in r/w mode).

There are several solutions here but they are either really hard to
implement or non definitive. For example:

* We could try to implement the reopen command for each special case,
  eg: blkmirror, reopening the same image, etc... and in such cases
  reusing the same bs that we already have. The downside is that this
  command will be coupled with all this special cases.

* We could use the transaction APIs without actually making it
  transaction (if we fail in the middle we can't rollback). The only
  advantage of this is that we'd provide a consistent API to libvirt
  and we would postpone the problem to the future. Anyway I strongly
  discourage this as it's completely unsafe and it's going to break
  the transaction semantic. Moreover it's a solution that relies too
  much on the hope of finding something appropriate in the future.

* We could leave it as it is, a distinct command that is not part of
  the transaction and that it's closing the old image before opening
  the new one.

> But since the API allows the creation of two mirrors in one command,
> I'm
> worried that someone will try to start a mirror on two disks at once,
> but then be stuck doing two separate 'drive-reopen' commands.  If the
> first succeeds but the second fails, you have now stranded the qemu
> process across two storage domains, which is exactly what we were
> trying
> to avoid in the first place by inventing transactions.  That is, even
> if
> all disks are individually consistent in a single domain, the act of
> migrating then reopening one disk at a time means you will have a
> window
> where disk 1 and disk 2 are opened on different storage domains.

This is not completely correct, the main intent was to not spread one
image chain across two storage domains (making it incomplete if one of
them was missing). In the next oVirt release a VM can have different
disks on different storage domains, so this wouldn't be a special case
but just a normal situation.

-- 
Federico



reply via email to

[Prev in Thread] Current Thread [Next in Thread]