[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC] Questions on the I/O performance of emulated host

From: Ying Fang
Subject: Re: [Qemu-devel] [RFC] Questions on the I/O performance of emulated host cdrom device
Date: Wed, 16 Jan 2019 10:48:23 +0800
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.5.0

On 2019/1/16 4:15, John Snow wrote:
> On 1/8/19 10:20 PM, Ying Fang wrote:
>> On 2019/1/8 20:46, Kevin Wolf wrote:
>>> Am 29.12.2018 um 07:33 hat Ying Fang geschrieben:
>>>> Hi.
>>>> Recently one of our customer complained about the I/O performance of QEMU 
>>>> emulated host cdrom device.
>>>> I did some investigation on it and there was still some point I could not 
>>>> figure out. So I had to ask for your help.
>>>> Here is the application scenario setup by our customer.
>>>> filename.iso        /dev/sr0           /dev/cdrom
>>>> remote client    -->  host(cdemu)      -->  Linux VM
>>>> (1)A remote client maps an iso file to x86 host machine through network 
>>>> using tcp.
>>>> (2)The cdemu daemon then load it as a local virtual cdrom disk drive.
>>>> (3)A VM is launched with the virtual cdrom disk drive configured.
>>>> The VM can make use of this virtual cdrom to install an OS in the iso file.
>>>> The network bandwith btw the remote client and host is 100Mbps, we test 
>>>> I/O perf using: dd if=/dev/sr0 of=/dev/null bs=32K count=100000.
>>>> And we have
>>>> (1) I/O perf of host side /dev/sr0 is 11MB/s;
>>>> (2) I/O perf of /dev/cdrom inside VM is 3.8MB/s.
>>>> As we can see, I/O perf of cdrom inside the VM is about 34.5% compared 
>>>> with host side.
>>>> FlameGraph is used to find out the bottleneck of this operation and we 
>>>> find out that too much time is occupied by calling *bdrv_is_inserted*.
>>>> Then we dig into the code and figure out that the ioctl in 
>>>> *cdrom_is_inserted* takes too much time, because it triggers 
>>>> io_schdule_timeout in kernel.
>>>> In the code path of emulated cdrom device, each DMA I/O request consists 
>>>> of several *bdrv_is_inserted*, which degrades the I/O performance by about 
>>>> 31% in our test.
>>>> static bool cdrom_is_inserted(BlockDriverState *bs)
>>>> {
>>>>     BDRVRawState *s = bs->opaque;
>>>>     int ret;
>>>>     ret = ioctl(s->fd, CDROM_DRIVE_STATUS, CDSL_CURRENT);
>>>>     return ret == CDS_DISC_OK;
>>>> }
>>>> A flamegraph svg file (cdrom.svg) is attachieved in this email to show the 
>>>> code timing profile we've tested.
>>>> So here is my question.
>>>> (1) Why do we regularly check the presence of a cdrom disk drive in the 
>>>> code path?  Can we do it asynchronously?
> The ATAPI emulator needs to know if it has storage present to carry out
> the request or to signal an error, so it checks. It might be the case
> that the VM operator forcibly dismounted network storage without
> awareness from the guest. (This is basically emulation of the case when
> a user mechanically forces a CDROM tray open.)
> I wasn't aware this check was so slow.
It is fast enough in most case, however it may be slow if the cdrom drive is a 
virtual drive mapped from remote client.
This is showed in 
The reason is ioctl(s->fd, CDROM_DRIVE_STATUS, CDSL_CURRENT) in 
cdrom_is_inserted takes much time in this scenario.

> Maybe we can just cache blk_is_inserted -- I don't remember if it's
> possible to eject media without awareness from the device model, but if
> this check winds up being slow in some configurations we can cache it.
To cache media status is a good idea here, we check blk_is_inserted instead and 
modify it when media status is changed.
> This won't help the bdrv_check_byte_request calls, though, just ones
> from the device model, see below
>>>> (2) Can we drop some check point in the code path to improve the 
>>>> performance?
>>>> Thanks.
>>> I'm actually not sure why so many places check it. Just letting an I/O
>>> request fail if the CD was removed would probably be easier.
>> You can see those check points as showed in the attached flamegraph file in 
>> this thread (rename it to cdrom.svg).
>> It is called mainly in bdrv_check_byte_request and ide_atapi_cmd, and only 
>> the host_cdrom backend implements
>> it using cdrom_is_inserted.
> in ide_atapi_cmd, try replacing:
> `!s->tray_open && blk_is_inserted(s->blk) && s->cdrom_changed)`
> with
> `!s->tray_open && s->cdrom_changed && blk_is_inserted(s->blk))`
> which should hopefully short-circuit calls to blk_is_inserted if
> s->cdrom_changed is false, but it doesn't do much to fix the subsequent
> call:
> `    if ((cmd->flags & CHECK_READY) &&
>         (!media_present(s) || !blk_is_inserted(s->blk)))`
> which is unfortunately going to check blk_is_inserted quite a lot
> because the read commands are tagged CHECK_READY...
Yes you are right.
> As alluded to above, too, I'm not sure what I can do about
> bdrv_check_byte_request -- what happens if the io.c helpers don't
> actually check this? Will they fail gracefully?
> I guess there's one way to find out.

I did some test here. As we can see only cdrom bdrv implements bdrv_is_inserted,
then I skip check presence of cdrom device.

I test I/O using dd if=/dev/sr0 of=/dev/null bs=32K count=1000
and manually disconnect the remote iso client from the host to inject error.

I found it failed gracefully and send an I/O error event.
monitor_qapi_event_emit[423]|: {"timestamp": {"seconds": 1547466098, 
"microseconds": 740265}, "event": "BLOCK_IO_ERROR", "data":
{"device": "drive-ide0-0-1", "node-name": "#block369", "reason": "Input/output 
error", "operation": "read", "action": "report"}}

diff --git a/block.c b/block.c
index 45145c5..5371ce2 100644
--- a/block.c
+++ b/block.c
@@ -3479,6 +3479,11 @@ bool bdrv_is_inserted(BlockDriverState *bs)
     if (!drv) {
         return false;
+    if (bdrv_is_read_only(bs)) {
+       return true;
+    }
     if (drv->bdrv_is_inserted) {
         return drv->bdrv_is_inserted(bs);

>>> To try out whether that would improve performance significantly, you
>>> could try to use the host_device backend rather than the host_cdrom
>>> backend. That one doesn't implement .bdrv_is_inserted, so the operation
>>> will be cheap (just return true unconditionally).
>> The remote client maps filename.iso to host virtual cdrom drive.
>> We use xml like
>>     <disk type='block' device='cdrom'>
>>       <driver name='qemu' type='raw' cache='none' io='native'/>
>>       <source dev='/dev/sr0'/>
>>       <target dev='hdb' bus='ide'/>
>>       <readonly/>
>>       <boot order='2'/>
>>       <address type='drive' controller='0' bus='0' target='0' unit='1'/>
>>     </disk>
>> to attach the virtual cdrom drive into guest and backend bdrv is host_cdrom.
>> Sorry I do not know how to use the host_device backend here.
>>> You will also lose eject/lock passthrough when doing so, so this is not
>>> the final solution, but if it proves to be a lot faster, we can check
>>> where bdrv_is_inserted() calls are actually important (if anywhere) and
>>> hopefully remove some even for the host_cdrom case.
>> You mean if we do not check *cdrom_is_inserted*, we may lose eject/lock here 
>> ?
>> I wonder we used to check the presence of the cdrom device each time it is 
>> touched
>> though I am not familiar with this feature.
>>> Kevin
>>> .

reply via email to

[Prev in Thread] Current Thread [Next in Thread]