qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 2/2] qemu-img: Add dd seek= option


From: Max Reitz
Subject: Re: [Qemu-devel] [PATCH 2/2] qemu-img: Add dd seek= option
Date: Thu, 16 Aug 2018 04:49:49 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

On 2018-08-16 04:39, Eric Blake wrote:
> On 08/15/2018 09:20 PM, Max Reitz wrote:
>> On 2018-08-15 04:56, Eric Blake wrote:
>>> For feature parity with dd, we want to be able to specify
>>> the offset within the output file, just as we can specify
>>> the offset for the input (in particular, this makes copying
>>> a subset range of guest-visible bytes from one file to
>>> another much easier).
>>
>> In my opinion, we do not want feature parity with dd.  What we do want
>> is feature parity with convert.
> 
> Well, convert is lacking a way to specify a subset of one file to move
> to a (possibly different) subset of the other.  I'm fine if we want to
> enhance convert to do the things that right now require a dd-alike
> interface (namely, limiting the copying to less than the full file, and
> choosing the offset at which to start [before this patch] or write to
> [with this patch]).

Yes, I would want that.

> If convert were more powerful, I'd be fine dropping 'qemu-img dd' after
> a proper deprecation period.

Technically it has those features already, with the raw block driver's
offset and size parameters.

>>> The code style for 'qemu-img dd' was pretty hard to read;
>>> unfortunately this patch focuses only on adding the new
>>> feature in the existing style rather than trying to improve
>>> the overall flow, other than switching octal constants to
>>> hex.  Oh well.
>>
>> No, the real issue is that dd is still not implemented just as a
>> frontend to convert.  Which it should be.  I'm not sure dd was a very
>> good idea from the start, and now it should ideally be a frontend to
>> convert.
>>
>> (My full opinion on the matter: dd has a horrible interface.  I don't
>> quite see why we replicated that inside qemu-img.  Also, if you want to
>> use dd, why not use qemu-nbd + Linux nbd device + real dd?)
> 
> Because of performance: qemu-nbd + Linux nbd device + real dd is one
> more layer of data copying (each write() from dd goes to kernel, then is
> sent to qemu-nbd in userspace as a socket message before being sent back
> to the kernel to actually write() to the final destination) compared to
> just doing it all in one process (write() lands in the final destination
> with no further user space bouncing).  And because the additional steps
> to set it up are awkward (see my other email where I rant about losing
> the better part of today to realizing that 'dd ...; qemu-nbd -d
> /dev/nbd1' loses data if you omit conv=fdatasync).

I can see the sync problems, but is the performance really that much worse?

>> ((That gave me a good idea.  Actually, it's probably not such a good
>> idea, but I guess I'll do it in my spare time anyway.  A qemu-img fuse
>> might be nice which represents an image as a raw image at some mount
>> point.  Benefits over qemu-nbd: (1) You don't need root, (2) you don't
>> need to type modprobe nbd.))
> 
> So the kernel->userspace translation would be happening via the FUSE
> interface instead of the NBD interface.  Data still bounces around just
> as much, but it might be a fun project.  Does fuse behave well when
> serving exactly one file at the mountpoint, rather than the more typical
> file system rooted in a directory?  NBD at least has the benefit of
> claiming to be a block device all along, rather than complicating the
> user interface with POSIX file system rules (which you'll be bending,
> because you are serving exactly one file instead of a system).

Well, but I can just pretend my FUSE file is a block device, no?

Also, I just discovered something really interesting: FUSE allows you to
specify a single file as a mountpoint.

And you know what?  You can open the original file before you replace it
by the FUSE "filesystem".

So my fun interface is going to looks like this:

$ qemu-img fuse foo.qcow2

And then your foo.qcow2 is a raw image until the next "fusermount -u
foo.qcow2"!  Isn't that fun?

>>> Also, switch the test to use an offset of 0 instead of 1,
>>> to test skip= and seek= on their own; as it is, this is
>>> effectively quadrupling the test runtime, which starts
>>> to make this test borderline on whether it should still
>>> belong to './check -g quick'.  And I didn't bother to
>>> reindent the test shell code for the new nested loop.
>>
>> In my opinion, it should no longer belong to quick.  It takes 8 s on my
>> tmpfs.  My border is somewhere around 2 or 3; and I haven't yet decided
>> whether that's on tmpfs or SSD.
> 
> I took 4 iterations pre-patch, to 8 iterations after patch 1, to 32
> iterations with this patch; my observed times went from 1s to 2s to 7s
> on SSD ext4. Yeah, for v2, I'll drop it from quick.

Thanks!

>>> @@ -4574,7 +4592,14 @@ static int img_dd(int argc, char **argv)
>>>           size = dd.count * in.bsz;
>>>       }
>>>
>>> -    qemu_opt_set_number(opts, BLOCK_OPT_SIZE, size, &error_abort);
>>> +    if (dd.flags & C_SEEK && out.offset * out.bsz > INT64_MAX - size) {
>>
>> What about overflows in out.offset * out.bsz?
> 
> I've had enough of my eyes bleeding on all the code repeatedly scaling
> things. For v2, I'm strongly considering a cleanup patch that reads all
> input, then scales all values into bytes, and THEN performs any
> additional math in a single unit, just so the additions become easier to
> reason about.

Haha.  I won't object.

>>> +        error_report("Seek too large for '%s'", out.filename);
>>> +        ret = -1;
>>> +        goto out;
>>
>> Real dd doesn't seem to error out (it just reports an error).  I don't
>> know whether that makes any difference, though.
> 
> But where does the data get written if you can't actually seek that far
> into the file?

Well, the stats printed say it doesn't write anything.  So that's why I
don't know whether it makes any difference.

>>
>> The test looks good to me.
> 
> Other than my creative indentation levels ;)

I like them.

I mean, usually I just don't indent anything when adding to a test case.
 I do it like this:


Original code:

...
qemu-img (something) $TEST_IMG
...


Post-my-patch:

for opt in x y; do
...
qemu-img (something) $opt $TEST_IMG
...
done


And I know I'm not the only one.  So, yeah, I liked your more creative
solution.

Max

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]