


From: Eric Blake
Subject: Re: [Qemu-devel] [Qemu-block] RFC: use case for adding QMP, block jobs & multiple exports to qemu-nbd ?
Date: Thu, 2 Nov 2017 12:50:39 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0

On 11/02/2017 12:04 PM, Daniel P. Berrange wrote:

> vm-a-disk1.qcow2 open - it's just a regular backing file setup.
>>>         |  (format=qcow2, proto=file)
>>>           |
>>>           +-  vm-a-disk1.qcow2  (qemu-system-XXX)
>>> The problem is that many VMs want to use cache-disk1.qcow2 as
>>> their disk's backing file, and only one process is permitted to be
>>> writing to a disk backing file at any time.
>> Can you explain a bit more about how many VMs are trying to write to
>> the same backing file 'cache-disk1.qcow2'?  I'd assume it's
>> just the "immutable" local backing store (once the previous 'mirror' job
>> is completed), based on which Nova creates a qcow2 overlay for each
>> instance it boots.
> An arbitrary number of vm-*-disk1.qcow2 files could exist all using
> the same cache-disk1.qcow2 image. It's only limited by how many VMs
> you can fit on the host. By definition you can only ever have a single
> process writing to a qcow2 file though, otherwise corruption will quickly
> follow.
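
A minimal sketch of the setup described above, assuming hypothetical file
names (the exact paths are illustrative, not taken from the thread): each VM
gets its own qcow2 overlay, and all overlays share one read-only backing file.

```shell
# Create several per-VM overlays that all reference the same backing file.
# The backing file itself is never opened for writing by the guests.
qemu-img create -f qcow2 -b cache-disk1.qcow2 -F qcow2 vm-a-disk1.qcow2
qemu-img create -f qcow2 -b cache-disk1.qcow2 -F qcow2 vm-b-disk1.qcow2

# Each qemu-system-XXX process writes only to its own overlay; the shared
# backing file is opened read-only, so the single-writer rule is respected.
```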

So if I'm following, your argument is that the local qemu-nbd process is
the only one writing to the file, while all other overlays are backed by
the NBD process; and then as any one of the VMs reads, the qemu-nbd
process pulls those sectors into the local storage as a result.
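
A sketch of that arrangement, with an illustrative socket path: one qemu-nbd
process is the sole writer to the cache image and exports it over NBD, and
the per-VM overlays then reference the export rather than the file directly.

```shell
# The single qemu-nbd process that owns cache-disk1.qcow2 for writing,
# exported over a local Unix socket (path is an example).
qemu-nbd --socket=/run/cache-disk1.sock --format=qcow2 cache-disk1.qcow2
```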

>> When I pointed this e-mail of yours to Matt Booth on Freenode Nova IRC
>> channel, he said the intermediate image (cache-disk1.qcow2) is COR
>> (Copy-On-Read).  I realize what COR is -- every time you read a cluster
>> from the backing file, you write that locally, to avoid reading it
>> again.
> qcow2 doesn't give you COR, only COW. So every read request would have a miss
> in cache-disk1.qcow2 and thus have to be fetched from master-disk1.qcow2. The
> use of drive-mirror to pull master-disk1.qcow2 contents into cache-disk1.qcow2
> makes up for the lack of COR by populating cache-disk1.qcow2 in the
> background.

Ah, but qcow2 (or more precisely, any protocol qemu BDS) DOES have
copy-on-read, built into the block layer.  See qemu-iotest 197 for an
example of it in use.  If we use COR correctly, then every initial read
request will miss in the cache, but the COR will populate the cache
without needing a background drive-mirror.  A background
drive-mirror may still be useful to populate the cache faster, but COR
populates the parts you want now regardless of how fast the background
task is running.
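
A sketch of enabling the block layer's copy-on-read for a guest, using the
same illustrative file names as above: any read that misses in the overlay
is fetched from the backing chain and written into the overlay as a side
effect, so the cache populates itself on demand.

```shell
# copy-on-read=on is the -drive option that activates the block layer's
# built-in COR for this disk (see qemu-iotest 197 for a worked example).
qemu-system-x86_64 \
    -drive file=vm-a-disk1.qcow2,format=qcow2,copy-on-read=on
```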

Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

