qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] Re: RFC v2: blockdev_add & friends, brief rationale, Q


From: Kevin Wolf
Subject: Re: [Qemu-devel] Re: RFC v2: blockdev_add & friends, brief rationale, QMP docs
Date: Thu, 17 Jun 2010 16:15:57 +0200
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.9) Gecko/20100430 Fedora/3.0.4-2.fc12 Thunderbird/3.0.4

Am 17.06.2010 15:01, schrieb Anthony Liguori:
> On 06/17/2010 03:20 AM, Kevin Wolf wrote:
>> Am 16.06.2010 20:07, schrieb Anthony Liguori:
>>    
>>>>    But it requires that
>>>> everything that -blockdev provides is accessible with -drive, too (or
>>>> that we're okay with users hating us).
>>>>
>>>>        
>>> I'm happy for -drive to die.  I think we should support -hda and
>>> -blockdev.
>>>      
>> -hda is not sufficient for most users. It doesn't provide any options.
>> It doesn't even support virtio. If -drive is going to die (and we seem
>> to agree all on that), then -blockdev needs to be usable for users (and
>> it's only you who contradicts so far).
>>    
> 
> I've always thought we should have a -vda argument and an -sda argument 
> specifically for specifying virtio and scsi disks.

It would at least fix the most obvious problem. However, it still
doesn't allow passing options.

>>> -blockdev should be optimized for config files, not single
>>> argument input.  IOW:
>>>
>>> [blockdev "blk2"]
>>>    format = "raw"
>>>    file = "/path/to/base.img"
>>>    cache = "writeback"
>>>
>>> [blockdev "blk1"]
>>>     format = "qcow2"
>>>     file = "/path/to/leaf.img"
>>>     cache="off"
>>>     backing_dev = "blk2"
>>>
>>> [device "disk1"]
>>>     driver = "ide-drive"
>>>     blockdev = "blk1"
>>>     bus = "0"
>>>     unit = "0"
>>>      
>> You don't specify the backing file of an image on the command line (or
>> in the configuration file).
> 
> But we really ought to allow it.  Backing files are implemented as part 
> of the core block layer, not the actual block formats.  

The generic block layer knows the name of the backing file, so it can be
displayed in tools, but that's about it. Calling this the
"implementation" of backing files is daring.

I see no use case for specifying it on the command line. The only thing
you can achieve better with it is corrupting your image because you
specify the wrong/no backing file next time.

> Today the block 
> layer queries the block format for the name of the backing file but gets 
> no additional options from the block format.  File isn't necessarily 
> enough information to successfully open the backing device so why treat 
> it specially?
> 
> I think we should keep the current ability to query the block format for 
> a backing file name but we should also support hooking up the backing 
> device without querying the block format at all.  It makes the model 
> much more elegant IMHO because then we're just creating block devices 
> and hooking them up.  All block devices are created equal more or less.
> 
>>   It's saved as part of the image. It's more
>> like this (for a simple raw image file):
>>
>> [blockdev-protocol "proto1"]
>>     protocol = "file"
>>     file = "/path/to/image.img"
>>
>> [blockdev "blk1"]
>>     format = "raw"
>>     cache="off"
>>     protocol = "proto1"
>>
>> [device "disk1"]
>>     driver = "ide-drive"
>>     blockdev = "blk1"
>>     bus = "0"
>>     unit = "0"
>>
>> (This would be Markus' option 3, I think)
>>    
> 
> I don't understand why we need two layers of abstraction here.  Why not 
> just:
> 
> [blockdev "proto1"]
>    protocol = "file"
>    cache = "off"
>    file = "/path/to/image.img"
> 
> Why does the cache option belong with raw and not with file and why 
> can't we just use file directly?

The cache option is shared along the chain, so it probably fits best in
the blockdev.

And we don't use file directly because it's wrong. Users say that their
image is in raw format, and they don't get why they should have to make
a difference between a raw image stored on a block device and one stored
in a file.

> As Christoph mentions, we really don't 
> have stacked protocols and I'm

The only question is if we call them stacked formats or stacked
protocols. One of them exists.

>>> not sure they make sense.
>>>      
>> Right, if we go for Christoph's suggestion, we don't need stacked
>> protocols. We'll have stacked formats instead. I'm not sure if you like
>> this any better. ;-)
>>
>> We do have stacking today. -hda blkdebug:test.cfg:foo.qcow2 is qcow2 on
>> blkdebug on file. We need to be able to represent this.
>>    
> 
> I think we need to do stacking in a device specific way.  When you look 
> at something like vmdk, it should actually support multiple leafs since 
> the format does support such a thing.  So what I'd suggest is:
> 
> [blockdev "part1"]
>    format = "raw"
>    file = "image000.vmdk"
> 
> [blockdev "part2"]
>    format = "raw"
>    file = "image001.vmdk"
> 
> [blockdev "image"]
>    format = "vmdk"
>    section0 = "part1"
>    section1 = "part2"

Actually, I'd prefer to read that information from the VMDK file instead
of requiring the user to configure this manually...

> Note, we'll need to support this sort of model in order to support a 
> disk that creates an automatic partition table (which would be a pretty 
> useful feature). 

Sounds like a good example of a useful protocol.

Markus, I'm afraid we've found an equivalent to Avi's mirror. If not
even more complicated, because we'd need to accept any length for the
list of partitions - possibly an option that should take an array?

> For blkdebug, it would look like:
> 
> [blockdev "disk"]
>    format = "qcow2"
>    file = "foo.qcow2"
> 
> [blockdev "debug"]
>    format = "blkdebug"
>    blockdev = "disk"

Something's wrong here. You end up with a chain qcow2 -> blkdebug ->
file, where file needs the file name. So format = "qcow2" and file =
"foo.qcow" can't belong to the same section.

I'm not sure how you'd build this from your description and which one
should be the blockdev that is attached to a guest device (this is what
is made much clearer if you distinguish between blockdev and
blockdev-protocol)

>>> I think raw doesn't make very much sense then.  What's the point of it
>>> if it's just a thin wrapper around a protocol?
>>>      
>> That it can be wrapped around any protocol. It's just about separating
>> code for handling the content of an image and code for accessing the image.
>>
>> Ever tried something like "qemu-img create -f raw /dev/something 10G"?
>> You need the host_device protocol there, not the file protocol. When we
>> had raw == file this completely failed. And it's definitely reasonable
>> to expect that it works because the image format _is_ raw, it's just not
>> saved in a file.
> 
> No, I don't actually thing it's reasonable.  There's nothing meaningful 
> that command can do. 

Except not adding special cases.

And I can only repeat it once more: Users don't understand why they need
to tell qemu that they are working on a block device. qemu can know this
very well by itself.

> Also, I've never understand creating qcow2 images 
> on a physical device.  qcow2 needs to grow dynamically and physical 
> devices can't.

We can discuss this over and over again, but it's used and we can't
ignore that.

> I understand that we need to support the later use case but I don't 
> think creating this layer of user-visible abstraction is the right thing 
> to do.  This is an obscure use case and it shouldn't be the model that 
> we force upon our users.

No, you're trying to force your model upon our users, which they have
made very clear in bug reports that they don't understand. The whole
point of the abstraction is that the internal code organization isn't
visible to the user any more (and of course that it makes the code nicer
as a consequence).

On the other hand, what does having raw == file buy us? Have you heard
of any problems that were caused by the change? (Other than the block
driver whitelist which needed to be changed in the configure call if you
were using it)

Kevin



reply via email to

[Prev in Thread] Current Thread [Next in Thread]