Re: [Qemu-devel] [RFC] Block device size rounding


From: Peter Crosthwaite
Subject: Re: [Qemu-devel] [RFC] Block device size rounding
Date: Fri, 16 Oct 2015 11:10:38 -0700

On Fri, Oct 16, 2015 at 10:04 AM, John Snow <address@hidden> wrote:
>
>
> On 10/14/2015 04:36 AM, Kevin Wolf wrote:
>> On 13.10.2015 at 17:51, John Snow wrote:
>>>
>>>
>>> On 10/13/2015 11:30 AM, Peter Crosthwaite wrote:
>>>> On Tue, Oct 13, 2015 at 2:14 AM, Kevin Wolf <address@hidden> wrote:
>>>>> On 12.10.2015 at 20:26, John Snow wrote:
>>>>>>
>>>>>>
>>>>>> On 10/12/2015 02:09 PM, Peter Crosthwaite wrote:
>>>>>>> On Mon, Oct 12, 2015 at 9:26 AM, Eric Blake <address@hidden> wrote:
>>>>>>>> On 10/12/2015 09:56 AM, John Snow wrote:
>>>>>>>>
>>>>>>>>>> What is the correct action here though? If the file is writeable
>>>>>>>>>> should we just allow the device to extend its size? Is that
>>>>>>>>>> possible already? Just zero-pad read-only?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Read-only seems like an easy case of append zeroes.
>>>>>>>>
>>>>>>>> Yes, allowing read-only with append-zero behavior seems sane.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Read-write ... well, we can't write-protect just half of a 512k block.
>>>>>>>>
>>>>>>>>> Forcibly increasing the size on RW or refusing to use the file
>>>>>>>>> altogether are probably the sane, deterministic things we want.
>>>>>>>>
>>>>>>>> I'd lean towards outright rejection if the file size isn't up to snuff
>>>>>>>> for use as read-write.  Forcibly increasing the size (done
>>>>>>>> unconditionally) still feels like magic, and may not be possible if the
>>>>>>>> size is due to something backed by a block device rather than a file.
>>>>>
>>>>> Agreed, let's just reject the image for r/w. Image resize should always
>>>>> be an explicit action invoked by the user, not a side effect of using
>>>>> the image with a specific device.
>>>>>
>>>>>>> Inability to extend is easily detectable and can become a failure mode
>>>>>>> in its own right. If we can't extend the file perhaps we can just
>>>>>>> LOG_UNIMP the data writes? Having to include in your user instructions
>>>>>>> "dd your already-on-SATA file system to this container just so it can
>>>>>>> work for SD" is a pain.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Peter
>>>>>>>
>>>>>>
>>>>>> Fits within my "Always extend the size" answer. Failing to do so is a
>>>>>> good cause to fail.
>>>>>>
>>>>>> I'm not sure if this is the sort of thing that might require an extra
>>>>>> flag or option for compatibility reasons or not, though. If there is no
>>>>>> precedent for QEMU resizing a block device to make it compatible with a
>>>>>> particular device model, it's probably reasonable that no management
>>>>>> tool is expecting this to happen automatically either.
>>>>>>
>>>>>> Then again, it's still annoying that the current default is definitely
>>>>>> broken.
>>>>>
>>>>> That's not so clear to me. Strictly speaking, this is really a user
>>>>> error because the user passed an image that isn't suitable for the
>>>>> device. All we're discussing is handling this user error in a friendlier way.
>>>>>
>>>>> Maybe we should take a step back: What's the specific use case here,
>>>>> i.e. where does the misaligned image come from and what is it used for?
>>>>
>>>> An ext filesystem image built by the Yocto build system. It is passed
>>>> straight to QEMU as a raw image. The user does not create disk images,
>>>> they are done by the build system. Note that the build system is not
>>>> QEMU specific, it is designed to target either QEMU or be used for
>>>> some form of real-hardware deployment so padding there is
>>>> inappropriate.
>>>>
>>>>> I assume this is not an image created with qemu-img, because then the
>>>>
>>>> I am not using qemu-img at all.
>>>>
>>>>> obvious options would already result in an aligned size.
>>>>>
>>>>
>>>> Maybe. What is the alignment of qemu-img? Note this requires 512K
>>>> alignment, which is kinda huge.
>>>>
>>>>>> I think this is going to boil down into an interface-and-expectations
>>>>>> argument. I am otherwise in favor of just forcing the resize whenever
>>>>>> possible and failing when it isn't.
>>>>>
>>>>> I'm strongly objecting to any automagic resizing of images.
>>>>>
>>>>
>>>> Can we LOG_UNIMP writes to the missing sectors? Then the user can RW to
>>>> the in-band sectors which should contain the limit of a pre-existing
>>>> filesystem.
>>>>
>>>
>>> This sounds potentially dangerous. Do we know for sure any data written
>>> here is unimportant?
>>>
>>> If it's all zeroes, we can probably guess it's unimportant. As soon as
>>> any non-zero data lands up in this extension range... how do we assert
>>> that this is garbage?
>>>
>>> I don't think we can...
>>
>> Can we return write errors to the guest? If so, and we know that

It is going to vary from device to device. Basically we need something
in the device spec with the semantics of "Your write failed for an
unknown reason and don't bother retrying".

That said, many devices and drivers need to support the notion of bad
blocks and sectors, so there's a good bet the guest can just handle
corruption. For example, in NAND flash the layout of a bad block is
well defined as a specific data pattern, so for that one we could just
mark these read-only-0 extended sectors as bad and everything is then
the guest's problem (as it is now completely valid for the device to
corrupt write data). I wonder if similar mechanisms exist for
everything else?
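
To make the bad-block idea concrete, here is a rough sketch of how a
device model might work out which blocks to mark bad. It is a Python
model only, not QEMU code: the function name and the block size are
hypothetical, and the actual bad-block marker layout is device specific
and not modelled here.

```python
def unbacked_blocks(image_size, device_size, block_size):
    """Return indices of blocks that are not fully backed by the image.

    Hypothetical helper: a block counts as unbacked if any of its bytes
    lie at or beyond image_size. device_size is assumed to be a multiple
    of block_size (i.e. already rounded up).
    """
    if device_size % block_size != 0:
        raise ValueError("device size must be a multiple of the block size")
    # The block containing the first missing byte is the first bad one.
    first = image_size // block_size
    total = device_size // block_size
    return list(range(first, total))


# Example: a 1000-byte image on a 2048-byte device with 512-byte blocks.
print(unbacked_blocks(1000, 2048, 512))
```

Note that this marks a partially backed block as bad in its entirety,
which trades away the in-band part of that block for simplicity.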

>> normally the guest shouldn't even try to access the area after the
>> filesystem, it might be reasonable enough to just return write errors in
>> the area that isn't covered by the image. Probably makes the guest
>> unhappy, but it's a bad guest anyway if it tries to write there.

Worthy of a LOG_UNIMP.
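
For illustration, the behaviour being discussed (round the device size
up, read zeroes from the missing area, fail and log writes to it) could
be modelled roughly as below. This is a Python sketch under my own
assumptions, not the QEMU block layer: PaddedImage, round_up and the
logger name are made up, and a write that straddles the end of the image
is simply failed as a whole.

```python
import logging

ALIGN = 512 * 1024  # the 512K alignment mentioned earlier in the thread

log = logging.getLogger("padded-image-sketch")  # hypothetical logger name


def round_up(size, align=ALIGN):
    """Round size up to the next multiple of align."""
    return -(-size // align) * align


class PaddedImage:
    """Toy model: image contents plus a read-only-zero tail up to alignment."""

    def __init__(self, data, align=ALIGN):
        self.data = bytearray(data)
        self.image_size = len(self.data)
        self.device_size = round_up(self.image_size, align)

    def read(self, offset, length):
        if offset < 0 or length < 0 or offset + length > self.device_size:
            raise ValueError("access beyond device size")
        chunk = bytes(self.data[offset:offset + length])
        # Anything past the end of the image reads back as zeroes.
        return chunk + b"\0" * (length - len(chunk))

    def write(self, offset, buf):
        end = offset + len(buf)
        if offset < 0 or end > self.device_size:
            raise ValueError("access beyond device size")
        if end > self.image_size:
            # Stand-in for a LOG_UNIMP-style message; the write is dropped.
            log.warning("write to unbacked range [%d, %d) failed", offset, end)
            return False
        self.data[offset:end] = buf
        return True
```

The image itself is never resized here; only the size presented to the
guest changes.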

>>
>> In that case, the device model should just round up the size, and the
>> block layer will automatically fail anything touching areas beyond the
>> image size.
>>
>> Kevin
>>
>
> Maybe as an option?
>
> This would break re-formatting, right? It might still be nice as a
> low-hassle option, though.

Not if the format process is bad-block tolerant.

Regards,
Peter


