From: John Snow
Subject: Re: [Qemu-devel] [RFC] Questions on the I/O performance of emulated host cdrom device
Date: Fri, 18 Jan 2019 17:43:30 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0
On 1/15/19 9:48 PM, Ying Fang wrote:
>
>
> On 2019/1/16 4:15, John Snow wrote:
>>
>>
>> On 1/8/19 10:20 PM, Ying Fang wrote:
>>>
>>>
>>> On 2019/1/8 20:46, Kevin Wolf wrote:
>>>> Am 29.12.2018 um 07:33 hat Ying Fang geschrieben:
>>>>> Hi.
>>>>> Recently one of our customers complained about the I/O performance of the
>>>>> QEMU-emulated host cdrom device.
>>>>> I did some investigation on it, but there are still some points I could not
>>>>> figure out, so I have to ask for your help.
>>>>>
>>>>> Here is the application scenario setup by our customer.
>>>>>  filename.iso     /dev/sr0       /dev/cdrom
>>>>> remote client --> host(cdemu) --> Linux VM
>>>>> (1) A remote client maps an iso file to the x86 host machine over the
>>>>> network using tcp.
>>>>> (2) The cdemu daemon then loads it as a local virtual cdrom disk drive.
>>>>> (3) A VM is launched with the virtual cdrom disk drive configured.
>>>>> The VM can use this virtual cdrom to install an OS from the iso
>>>>> file.
>>>>>
>>>>> The network bandwidth between the remote client and the host is 100Mbps.
>>>>> We test I/O performance using: dd if=/dev/sr0 of=/dev/null bs=32K count=100000.
>>>>> And we have
>>>>> (1) I/O perf of host side /dev/sr0 is 11MB/s;
>>>>> (2) I/O perf of /dev/cdrom inside VM is 3.8MB/s.
>>>>>
>>>>> As we can see, the I/O performance of the cdrom inside the VM is only
>>>>> about 34.5% of that on the host side.
>>>>> FlameGraph was used to find the bottleneck of this operation, and we
>>>>> found that too much time is spent calling *bdrv_is_inserted*.
>>>>> Then we dug into the code and figured out that the ioctl in
>>>>> *cdrom_is_inserted* takes too much time, because it triggers
>>>>> io_schedule_timeout in the kernel.
>>>>> In the code path of the emulated cdrom device, each DMA I/O request
>>>>> involves several *bdrv_is_inserted* calls, which degrades the I/O
>>>>> performance by about 31% in our test.
>>>>> static bool cdrom_is_inserted(BlockDriverState *bs)
>>>>> {
>>>>>     BDRVRawState *s = bs->opaque;
>>>>>     int ret;
>>>>>
>>>>>     ret = ioctl(s->fd, CDROM_DRIVE_STATUS, CDSL_CURRENT);
>>>>>     return ret == CDS_DISC_OK;
>>>>> }
>>>>> A flamegraph svg file (cdrom.svg) is attached to this email to show
>>>>> the code timing profile we've measured.
>>>>>
>>>>> So here is my question.
>>>>> (1) Why do we regularly check the presence of a cdrom disk drive in the
>>>>> code path? Can we do it asynchronously?
>>
>> The ATAPI emulator needs to know if it has storage present to carry out
>> the request or to signal an error, so it checks. It might be the case
>> that the VM operator forcibly dismounted network storage without
>> awareness from the guest. (This is basically emulation of the case when
>> a user mechanically forces a CDROM tray open.)
>>
>> I wasn't aware this check was so slow.
> It is fast enough in most cases; however, it may be slow if the cdrom drive is
> a virtual drive mapped from a remote client.
> This is shown in
> https://raw.githubusercontent.com/fangying/fangying.github.io/master/94E119FA-8BA1-41AB-A26A-FBDC635BCD2C.png
> The reason is that the ioctl(s->fd, CDROM_DRIVE_STATUS, CDSL_CURRENT) in
> cdrom_is_inserted takes a lot of time in this scenario.
>
>>
>> Maybe we can just cache blk_is_inserted -- I don't remember if it's
>> possible to eject media without awareness from the device model, but if
>> this check winds up being slow in some configurations we can cache it.
> Caching the media status is a good idea here: we can cache the result of
> blk_is_inserted and update it when the media status changes.
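
Something like the following is what I have in mind -- a rough, untested
sketch; the struct and helper names are illustrative, not actual QEMU
identifiers:

/* Untested sketch: cache the last-known media state in the device model and
 * refresh it only from the medium change callback, so the hot request path
 * avoids the CDROM_DRIVE_STATUS ioctl.  All names below are illustrative. */
typedef struct CachedMediaState {
    bool inserted;   /* last observed result of blk_is_inserted() */
    bool valid;      /* false until refreshed at least once */
} CachedMediaState;

static bool cached_is_inserted(BlockBackend *blk, CachedMediaState *c)
{
    if (!c->valid) {
        c->inserted = blk_is_inserted(blk);   /* slow path: hits the ioctl */
        c->valid = true;
    }
    return c->inserted;
}

/* Invalidate from the medium change callback (BlockDevOps.change_media_cb),
 * so an eject/insert forces a re-check on the next request. */
static void cached_media_invalidate(CachedMediaState *c)
{
    c->valid = false;
}

The device model would embed a CachedMediaState and invalidate it whenever it
is notified of a medium change.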
>>
>> This won't help the bdrv_check_byte_request calls, though, just the ones
>> from the device model; see below
>>
>>>>> (2) Can we drop some check points in the code path to improve the
>>>>> performance?
>>>>> Thanks.
>>>>
>>>> I'm actually not sure why so many places check it. Just letting an I/O
>>>> request fail if the CD was removed would probably be easier.
>>>>
>>> You can see those check points as shown in the flamegraph file attached to
>>> this thread (rename it to cdrom.svg).
>>> It is called mainly from bdrv_check_byte_request and ide_atapi_cmd, and only
>>> the host_cdrom backend implements it, using cdrom_is_inserted.
>>>
>>
>> in ide_atapi_cmd, try replacing:
>>
>> `!s->tray_open && blk_is_inserted(s->blk) && s->cdrom_changed)`
>>
>> with
>>
>> `!s->tray_open && s->cdrom_changed && blk_is_inserted(s->blk))`
>>
>> which should hopefully short-circuit calls to blk_is_inserted if
>> s->cdrom_changed is false, but it doesn't do much to fix the subsequent
>> call:
>>
>> ` if ((cmd->flags & CHECK_READY) &&
>> (!media_present(s) || !blk_is_inserted(s->blk)))`
>>
>> which is unfortunately going to check blk_is_inserted quite a lot
>> because the read commands are tagged CHECK_READY...
> Yes, you are right.
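
If we go the caching route above, the CHECK_READY test could use the cached
helper too; roughly like this (sketch only -- s->media_cache is a hypothetical
field, not something in the current IDEState):

    /* Sketch: consult the cached media state instead of hitting the ioctl
     * on every CHECK_READY-tagged command. */
    if ((cmd->flags & CHECK_READY) &&
        (!media_present(s) || !cached_is_inserted(s->blk, &s->media_cache))) {
        ide_atapi_cmd_check_status(s);
        return;
    }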
>>
>>
>> As alluded to above, too, I'm not sure what I can do about
>> bdrv_check_byte_request -- what happens if the io.c helpers don't
>> actually check this? Will they fail gracefully?
>>
>> I guess there's one way to find out.
>
> I did some tests here. As we can see, only the cdrom backend implements
> bdrv_is_inserted, so I skipped checking the presence of the cdrom device.
>
> I tested I/O using dd if=/dev/sr0 of=/dev/null bs=32K count=1000
> and manually disconnected the remote iso client from the host to inject an error.
>
> I found that it failed gracefully and sent an I/O error event:
> monitor_qapi_event_emit[423]|: {"timestamp": {"seconds": 1547466098,
> "microseconds": 740265}, "event": "BLOCK_IO_ERROR", "data":
> {"device": "drive-ide0-0-1", "node-name": "#block369", "reason":
> "Input/output error", "operation": "read", "action": "report"}}
>
> diff --git a/block.c b/block.c
> index 45145c5..5371ce2 100644
> --- a/block.c
> +++ b/block.c
> @@ -3479,6 +3479,11 @@ bool bdrv_is_inserted(BlockDriverState *bs)
>      if (!drv) {
>          return false;
>      }
> +
> +    if (bdrv_is_read_only(bs)) {
> +        return true;
> +    }
> +
>      if (drv->bdrv_is_inserted) {
>          return drv->bdrv_is_inserted(bs);
>      }
>
> Thanks.
>>
>>>> To try out whether that would improve performance significantly, you
>>>> could try to use the host_device backend rather than the host_cdrom
>>>> backend. That one doesn't implement .bdrv_is_inserted, so the operation
>>>> will be cheap (just return true unconditionally).
>>>>
>>> The remote client maps filename.iso to the host virtual cdrom drive.
>>> We use XML like
>>>     <disk type='block' device='cdrom'>
>>>       <driver name='qemu' type='raw' cache='none' io='native'/>
>>>       <source dev='/dev/sr0'/>
>>>       <target dev='hdb' bus='ide'/>
>>>       <readonly/>
>>>       <boot order='2'/>
>>>       <address type='drive' controller='0' bus='0' target='0' unit='1'/>
>>>     </disk>
>>> to attach the virtual cdrom drive to the guest, so the backend bdrv is host_cdrom.
>>> Sorry, I do not know how to use the host_device backend here.
>>>
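
For Kevin's experiment you don't strictly need libvirt; from memory (untested,
and the node name and device options are just examples) you can force the
host_device driver on the QEMU command line with -blockdev, e.g.:

# Force the host_device backend instead of letting QEMU probe host_cdrom
# for /dev/sr0 (node-name and IDE bus/unit values are only examples):
qemu-system-x86_64 ... \
    -blockdev driver=host_device,filename=/dev/sr0,node-name=cd0,read-only=on \
    -device ide-cd,drive=cd0,bus=ide.0,unit=1

If that turns out to be much faster, it narrows the problem down to
bdrv_is_inserted, as Kevin suspects.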
>>>> You will also lose eject/lock passthrough when doing so, so this is not
>>>> the final solution, but if it proves to be a lot faster, we can check
>>>> where bdrv_is_inserted() calls are actually important (if anywhere) and
>>>> hopefully remove some even for the host_cdrom case.
>>>>
>>> You mean that if we do not check *cdrom_is_inserted*, we may lose eject/lock
>>> passthrough here?
>>> I wonder why we used to check the presence of the cdrom device each time it
>>> is touched, though I am not familiar with this feature.
>>>
>>>> Kevin
>>>>
>>>>
>>>
>>>
>>
>>
>>
>
Do you intend to send a patch along? I doubt I'll be able to prioritize
this, but I will look over any patches that get sent. I am a little
skeptical of the "read_only" optimization; I might need Kevin's input on
that one, but a patch would be a good place for that discussion.
--js