Re: [Qemu-devel] [RFC] Questions on the I/O performance of emulated host cdrom device
From: Kevin Wolf
Subject: Re: [Qemu-devel] [RFC] Questions on the I/O performance of emulated host cdrom device
Date: Tue, 8 Jan 2019 13:46:50 +0100
User-agent: Mutt/1.10.1 (2018-07-13)
Am 29.12.2018 um 07:33 hat Ying Fang geschrieben:
> Hi.
> Recently one of our customers complained about the I/O performance of the
> QEMU-emulated host cdrom device.
> I investigated it, but there are still some points I could not figure out,
> so I am asking for your help.
>
> Here is the application scenario set up by our customer:
> filename.iso /dev/sr0 /dev/cdrom
> remote client --> host(cdemu) --> Linux VM
> (1) A remote client maps an iso file to an x86 host machine over the network
> using TCP.
> (2) The cdemu daemon then loads it as a local virtual cdrom disk drive.
> (3) A VM is launched with the virtual cdrom disk drive configured.
> The VM can use this virtual cdrom to install the OS contained in the iso file.
>
> The network bandwidth between the remote client and the host is 100 Mbps. We
> test I/O performance using: dd if=/dev/sr0 of=/dev/null bs=32K count=100000.
> The results are:
> (1) I/O performance of the host-side /dev/sr0 is 11 MB/s;
> (2) I/O performance of /dev/cdrom inside the VM is 3.8 MB/s.
>
> As we can see, cdrom I/O performance inside the VM is only about 34.5% of the
> host side's.
> We used FlameGraph to find the bottleneck of this operation, and it shows that
> too much time is spent calling *bdrv_is_inserted*.
> Digging into the code, we figured out that the ioctl in *cdrom_is_inserted*
> takes too much time, because it triggers io_schedule_timeout in the kernel.
> In the code path of the emulated cdrom device, each DMA I/O request involves
> several calls to *bdrv_is_inserted*, which degrades I/O performance by about
> 31% in our test.
> static bool cdrom_is_inserted(BlockDriverState *bs)
> {
>     BDRVRawState *s = bs->opaque;
>     int ret;
>
>     ret = ioctl(s->fd, CDROM_DRIVE_STATUS, CDSL_CURRENT);
>     return ret == CDS_DISC_OK;
> }
> A flamegraph svg file (cdrom.svg) is attached to this email to show the code
> timing profile we measured.
>
> So here are my questions:
> (1) Why do we regularly check for the presence of a cdrom disk in this code
> path? Can we do it asynchronously?
> (2) Can we drop some of the checks in the code path to improve performance?
> Thanks.
I'm actually not sure why so many places check it. Just letting an I/O
request fail if the CD was removed would probably be easier.
To try out whether that would improve performance significantly, you
could try to use the host_device backend rather than the host_cdrom
backend. That one doesn't implement .bdrv_is_inserted, so the operation
becomes cheap (it just returns true unconditionally).
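For reference, forcing the host_device backend could look like the command
line below. This is only an illustrative sketch: the device path, machine
type and drive options are assumptions, not taken from the original report,
so adjust them to match the actual setup.

```shell
# Illustrative only: select the raw format driver on top of the
# host_device protocol driver explicitly, instead of letting QEMU
# probe /dev/sr0 by filename and pick host_cdrom.
qemu-system-x86_64 \
    -drive format=raw,media=cdrom,file.driver=host_device,file.filename=/dev/sr0
```

With file.driver left out, QEMU's filename probing would normally choose
host_cdrom for a /dev/sr* device, so spelling out the protocol driver is
the point of this experiment.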
You will also lose eject/lock passthrough when doing so, so this is not
the final solution, but if it proves to be a lot faster, we can check
where bdrv_is_inserted() calls are actually important (if anywhere) and
hopefully remove some even for the host_cdrom case.
Kevin
Re: [Qemu-devel] [RFC] Questions on the I/O performance of emulated host cdrom device, fangying, 2019/01/22