[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH 2/2] qemu-img: Add dd seek= option
From: |
Max Reitz |
Subject: |
Re: [Qemu-devel] [PATCH 2/2] qemu-img: Add dd seek= option |
Date: |
Thu, 16 Aug 2018 04:49:49 +0200 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 |
On 2018-08-16 04:39, Eric Blake wrote:
> On 08/15/2018 09:20 PM, Max Reitz wrote:
>> On 2018-08-15 04:56, Eric Blake wrote:
>>> For feature parity with dd, we want to be able to specify
>>> the offset within the output file, just as we can specify
>>> the offset for the input (in particular, this makes copying
>>> a subset range of guest-visible bytes from one file to
>>> another much easier).
>>
>> In my opinion, we do not want feature parity with dd. What we do want
>> is feature parity with convert.
>
> Well, convert is lacking a way to specify a subset of one file to move
> to a (possibly different) subset of the other. I'm fine if we want to
> enhance convert to do the things that right now require a dd-alike
> interface (namely, limiting the copying to less than the full file, and
> choosing the offset at which to start [before this patch] or write to
> [with this patch]).
Yes, I would want that.
> If convert were more powerful, I'd be fine dropping 'qemu-img dd' after
> a proper deprecation period.
Technically it has those features already, with the raw block driver's
offset and size parameters.
>>> The code style for 'qemu-img dd' was pretty hard to read;
>>> unfortunately this patch focuses only on adding the new
>>> feature in the existing style rather than trying to improve
>>> the overall flow, other than switching octal constants to
>>> hex. Oh well.
>>
>> No, the real issue is that dd is still not implemented just as a
>> frontend to convert. Which it should be. I'm not sure dd was a very
>> good idea from the start, and now it should ideally be a frontend to
>> convert.
>>
>> (My full opinion on the matter: dd has a horrible interface. I don't
>> quite see why we replicated that inside qemu-img. Also, if you want to
>> use dd, why not use qemu-nbd + Linux nbd device + real dd?)
>
> Because of performance: qemu-nbd + Linux nbd device + real dd is one
> more layer of data copying (each write() from dd goes to kernel, then is
> sent to qemu-nbd in userspace as a socket message before being sent back
> to the kernel to actually write() to the final destination) compared to
> just doing it all in one process (write() lands in the final destination
> with no further user space bouncing). And because the additional steps
> to set it up are awkward (see my other email where I rant about losing
> the better part of today to realizing that 'dd ...; qemu-nbd -d
> /dev/nbd1' loses data if you omit conv=fdatasync).
I can see the sync problems, but is the performance really that much worse?
>> ((That gave me a good idea. Actually, it's probably not such a good
>> idea, but I guess I'll do it in my spare time anyway. A qemu-img fuse
>> might be nice which represents an image as a raw image at some mount
>> point. Benefits over qemu-nbd: (1) You don't need root, (2) you don't
>> need to type modprobe nbd.))
>
> So the kernel->userspace translation would be happening via the FUSE
> interface instead of the NBD interface. Data still bounces around just
> as much, but it might be a fun project. Does fuse behave well when
> serving exactly one file at the mountpoint, rather than the more typical
> file system rooted in a directory? NBD at least has the benefit of
> claiming to be a block device all along, rather than complicating the
> user interface with POSIX file system rules (which you'll be bending,
> because you are serving exactly one file instead of a system).
Well, but I can just pretend my FUSE file is a block device, no?
Also, I just discovered something really interesting: FUSE allows you to
specify a single file as a mountpoint.
And you know what? You can open the original file before you replace it
by the FUSE "filesystem".
So my fun interface is going to looks like this:
$ qemu-img fuse foo.qcow2
And then your foo.qcow2 is a raw image until the next "fusermount -u
foo.qcow2"! Isn't that fun?
>>> Also, switch the test to use an offset of 0 instead of 1,
>>> to test skip= and seek= on their own; as it is, this is
>>> effectively quadrupling the test runtime, which starts
>>> to make this test borderline on whether it should still
>>> belong to './check -g quick'. And I didn't bother to
>>> reindent the test shell code for the new nested loop.
>>
>> In my opinion, it should no longer belong to quick. It takes 8 s on my
>> tmpfs. My border is somewhere around 2 or 3; and I haven't yet decided
>> whether that's on tmpfs or SSD.
>
> I took 4 iterations pre-patch, to 8 iterations after patch 1, to 32
> iterations with this patch; my observed times went from 1s to 2s to 7s
> on SSD ext4. Yeah, for v2, I'll drop it from quick.
Thanks!
>>> @@ -4574,7 +4592,14 @@ static int img_dd(int argc, char **argv)
>>> size = dd.count * in.bsz;
>>> }
>>>
>>> - qemu_opt_set_number(opts, BLOCK_OPT_SIZE, size, &error_abort);
>>> + if (dd.flags & C_SEEK && out.offset * out.bsz > INT64_MAX - size) {
>>
>> What about overflows in out.offset * out.bsz?
>
> I've had enough of my eyes bleeding on all the code repeatedly scaling
> things. For v2, I'm strongly considering a cleanup patch that reads all
> input, then scales all values into bytes, and THEN performs any
> additional math in a single unit, just so the additions become easier to
> reason about.
Haha. I won't object.
>>> + error_report("Seek too large for '%s'", out.filename);
>>> + ret = -1;
>>> + goto out;
>>
>> Real dd doesn't seem to error out (it just reports an error). I don't
>> know whether that makes any difference, though.
>
> But where does the data get written if you can't actually seek that far
> into the file?
Well, the stats printed say it doesn't write anything. So that's why I
don't know whether it makes any difference.
>>
>> The test looks good to me.
>
> Other than my creative indentation levels ;)
I like them.
I mean, usually I just don't indent anything when adding to a test case.
I do it like this:
Original code:
...
qemu-img (something) $TEST_IMG
...
Post-my-patch:
for opt in x y; do
...
qemu-img (something) $opt $TEST_IMG
...
done
And I know I'm not the only one. So, yeah, I liked your more creative
solution.
Max
signature.asc
Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH 0/2] Improve qemu-img dd, Eric Blake, 2018/08/15
Re: [Qemu-devel] [PATCH 0/2] Improve qemu-img dd, no-reply, 2018/08/16
Re: [Qemu-devel] [PATCH 0/2] Improve qemu-img dd, no-reply, 2018/08/16
Re: [Qemu-devel] [PATCH 0/2] Improve qemu-img dd, Eric Blake, 2018/08/19