Re: [Qemu-devel] [PATCH 2/2] virtio: fix IO request length in virtio SCSI/block


From: Harris, James R
Subject: Re: [Qemu-devel] [PATCH 2/2] virtio: fix IO request length in virtio SCSI/block
Date: Mon, 18 Dec 2017 16:16:08 +0000

> On Dec 18, 2017, at 6:38 AM, Stefan Hajnoczi <address@hidden> wrote:
> 
> On Fri, Dec 15, 2017 at 06:02:50PM +0300, Denis V. Lunev wrote:
>> Linux guests submit IO requests no longer than PAGE_SIZE * max_seg,
>> where max_seg is the maximum segment count reported by the SCSI
>> controller. Thus a typical 1 MB sequential read results in the
>> following IO pattern from the guest:
>>  8,16   1    15754     2.766095122  2071  D   R 2095104 + 1008 [dd]
>>  8,16   1    15755     2.766108785  2071  D   R 2096112 + 1008 [dd]
>>  8,16   1    15756     2.766113486  2071  D   R 2097120 + 32 [dd]
>>  8,16   1    15757     2.767668961     0  C   R 2095104 + 1008 [0]
>>  8,16   1    15758     2.768534315     0  C   R 2096112 + 1008 [0]
>>  8,16   1    15759     2.768539782     0  C   R 2097120 + 32 [0]
>> The IO was generated by
>>  dd if=/dev/sda of=/dev/null bs=1024 iflag=direct
>> 
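For reference, the traces above are standard blktrace/blkparse output:
device major,minor; CPU; sequence number; timestamp; PID; action
(D = dispatched to the driver, C = completed); R = read; then start
sector + length in 512-byte sectors, with the issuing process in brackets.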
>> This effectively means that on rotational disks we will observe three
>> I/O requests for each 1 MB processed, which definitely hurts both guest
>> and host IO performance.
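The request boundaries follow directly from the current max_seg of 126
(128 descriptors minus 2 for metadata): with 4 KiB pages, 126 * 4096 =
516096 bytes = 1008 512-byte sectors, which is exactly the "+ 1008" seen
in the trace above.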
>> 
>> The cure is relatively simple - we should report the SCSI controller's
>> full scatter-gather capability. Fortunately the situation here is very
>> good: the VirtIO transport layer can accommodate 1024 items in one
>> request while we are using only 128, and this has been the case since
>> almost the very beginning. Two items are reserved for request metadata,
>> so we should publish VIRTQUEUE_MAX_SIZE - 2 as max_seg.
>> 
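As a minimal sketch of what this amounts to in hw/scsi/virtio-scsi.c
(assuming QEMU's virtio_stl_p() accessor and the virtio_scsi_config
layout from the standard headers; the actual patch makes the value a
configurable property rather than a constant):

    /* sketch: advertise the transport's full scatter-gather capacity */
    static void virtio_scsi_get_config(VirtIODevice *vdev, uint8_t *config)
    {
        struct virtio_scsi_config *scsiconf =
            (struct virtio_scsi_config *)config;

        /* was: virtio_stl_p(vdev, &scsiconf->seg_max, 128 - 2);
         * VIRTQUEUE_MAX_SIZE is 1024; 2 descriptors stay reserved for
         * the request/response headers.  Other config fields omitted. */
        virtio_stl_p(vdev, &scsiconf->seg_max, VIRTQUEUE_MAX_SIZE - 2);
    }
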
>> The following pattern is observed after the patch:
>>  8,16   1     9921     2.662721340  2063  D   R 2095104 + 1024 [dd]
>>  8,16   1     9922     2.662737585  2063  D   R 2096128 + 1024 [dd]
>>  8,16   1     9923     2.665188167     0  C   R 2095104 + 1024 [0]
>>  8,16   1     9924     2.665198777     0  C   R 2096128 + 1024 [0]
>> which is much better.
>> 
>> The dark side of this patch is that we are tweaking a guest-visible
>> parameter, though this should be relatively safe, as the transport
>> layer support described above has been present in QEMU and host Linux
>> for a very long time. The patch adds a configurable property for
>> VirtIO SCSI with a new default, and a hardcoded value for VirtIO
>> block, which does not provide a good configuration framework.
>> 
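The configurable half presumably follows the usual QEMU property pattern,
along these lines (a sketch; the exact field and struct names are guesses,
not necessarily the patch's):

    static Property virtio_scsi_properties[] = {
        /* hypothetical "max_segments" property with the new default */
        DEFINE_PROP_UINT32("max_segments", VirtIOSCSI,
                           parent_obj.conf.max_segments,
                           VIRTQUEUE_MAX_SIZE - 2),
        /* ... existing properties ... */
        DEFINE_PROP_END_OF_LIST(),
    };
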
>> Signed-off-by: Denis V. Lunev <address@hidden>
>> CC: "Michael S. Tsirkin" <address@hidden>
>> CC: Stefan Hajnoczi <address@hidden>
>> CC: Kevin Wolf <address@hidden>
>> CC: Max Reitz <address@hidden>
>> CC: Paolo Bonzini <address@hidden>
>> CC: Richard Henderson <address@hidden>
>> CC: Eduardo Habkost <address@hidden>
>> ---
>> include/hw/compat.h             | 17 +++++++++++++++++
>> include/hw/virtio/virtio-blk.h  |  1 +
>> include/hw/virtio/virtio-scsi.h |  1 +
>> hw/block/virtio-blk.c           |  4 +++-
>> hw/scsi/vhost-scsi.c            |  2 ++
>> hw/scsi/vhost-user-scsi.c       |  2 ++
>> hw/scsi/virtio-scsi.c           |  4 +++-
>> 7 files changed, 29 insertions(+), 2 deletions(-)
>> 
>> diff --git a/include/hw/compat.h b/include/hw/compat.h
>> index 026fee9..b9be5d7 100644
>> --- a/include/hw/compat.h
>> +++ b/include/hw/compat.h
>> @@ -2,6 +2,23 @@
>> #define HW_COMPAT_H
>> 
>> #define HW_COMPAT_2_11 \
>> +    {\
>> +        .driver   = "virtio-blk-device",\
>> +        .property = "max_segments",\
>> +        .value    = "126",\
>> +    },{\
>> +        .driver   = "vhost-scsi",\
>> +        .property = "max_segments",\
>> +        .value    = "126",\
>> +    },{\
>> +        .driver   = "vhost-user-scsi",\
>> +        .property = "max_segments",\
>> +        .value    = "126",\
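(These HW_COMPAT_2_11 entries pin max_segments to the previous default of
126 for 2.11-and-older machine types, so the guest-visible seg_max does
not change under live migration from an older QEMU.)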
> 
> Existing vhost-user-scsi slave programs might not expect up to 1022
> segments.  Hopefully we can get away with this change since there are
> relatively few vhost-user-scsi slave programs.
> 
> CCed Felipe (Nutanix) and Jim (SPDK) in case they have comments.

SPDK vhost-user targets expect at most 128 segments.  They also pre-allocate
I/O task structures when QEMU connects to the vhost-user device.

Supporting up to 1022 segments would mean significantly higher memory usage,
a reduced I/O queue depth in the vhost-user target, or dynamically
allocating I/O task structures - none of which is ideal.
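As a rough back-of-the-envelope illustration (assuming a 16-byte struct
iovec on a 64-bit host, not SPDK's actual task layout): growing a
pre-allocated per-task scatter list from 128 to 1022 entries adds
(1022 - 128) * 16 = ~14 KB per task, i.e. ~14 MB for 1024 pre-allocated
tasks on a single device.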

What if this were just bumped from 126 to 128?  I guess I’m trying to
understand the level of guest and host I/O performance that is gained with
this patch.  One I/O per 512 KB vs. one I/O per 4 MB - we are still only
talking about a few hundred IO/s difference.
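For example, assuming a rotational disk streaming at 200 MB/s: 512 KB
requests come to ~400 IOPS and 4 MB requests to ~50 IOPS, so the gap is
indeed only a few hundred requests per second.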

-Jim


