From: Roman Penyaev
Subject: Re: [Qemu-devel] [PATCH v3 0/7] virtio-blk: multiqueue support
Date: Mon, 20 Jun 2016 15:29:43 +0200

Hi, Stefan.

On Mon, Jun 20, 2016 at 12:36 PM, Stefan Hajnoczi <address@hidden> wrote:
> On Tue, Jun 07, 2016 at 05:28:24PM +0100, Stefan Hajnoczi wrote:
>> v3:
>>  * Drop Patch 1 to batch guest notify for non-dataplane
>>
>>    The Linux AIO completion BH and the virtio-blk batch notify BH
>>    changed order in the AioContext->first_bh list as a side-effect of
>>    moving the BH from hw/block/dataplane/virtio-blk.c to
>>    hw/block/virtio-blk.c.  This caused a serious performance regression
>>    for both dataplane and non-dataplane.
>>
>>    I've decided not to move the BH in this series and work on a separate
>>    solution for making batch notify generic.
>>
>>    The remaining patches have been reordered and cleaned up.
>>
>>  * See performance data below.
>>
>> v2:
>>  * Simplify s->rq live migration [Paolo]
>>  * Use more efficient bitmap ops for batch notification [Paolo]
>>  * Fix perf regression due to batch notify BH in wrong AioContext [Christian]
>>
>> The virtio_blk guest driver has supported multiple virtqueues since
>> Linux 3.17.  This patch series adds multiple virtqueues to QEMU's
>> virtio-blk emulated device.
>>
>> Ming Lei sent patches previously but these were not merged.  This series
>> implements virtio-blk multiqueue for QEMU from scratch since the codebase has
>> changed.  Live migration support for s->rq was also missing from the previous
>> series and has been added.
>>
>> It's important to note that QEMU's block layer does not support
>> multiqueue yet.  Therefore the virtio-blk device processes all
>> virtqueues in the same AioContext (IOThread).  Further work is
>> necessary to take advantage of multiqueue support in QEMU's block
>> layer once it becomes available.
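
In code terms, that limitation boils down to a loop like the following
at dataplane start (a sketch with a hypothetical helper name; the real
series uses the existing host-notifier APIs):

    /* Attach every virtqueue's host notifier to the SAME AioContext:
     * one IOThread services all queues until QEMU's block layer
     * itself gains multiqueue support. */
    for (i = 0; i < s->conf->num_queues; i++) {
        VirtQueue *vq = virtio_get_queue(vdev, i);
        attach_host_notifier_to_ctx(vq, s->ctx);  /* hypothetical helper */
    }
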
>>
>> Performance results:
>>
>> Using virtio-blk-pci,num-queues=4 can produce a speed-up but -smp 4
>> introduces a lot of variance across runs.  No pinning was performed.
>>
>> Results show that there is no regression anymore, thanks to dropping the
>> batch notify BH patch.
>>
>> RHEL 7.2 guest on RHEL 7.2 host with 1 vcpu and 1 GB RAM unless otherwise
>> noted.  The default configuration of the Linux null_blk driver is used as
>> /dev/vdb.
>>
>> $ cat files/fio.job
>> [global]
>> filename=/dev/vdb
>> ioengine=libaio
>> direct=1
>> runtime=60
>> ramp_time=5
>> gtod_reduce=1
>>
>> [job1]
>> numjobs=4
>> iodepth=16
>> rw=randread
>> bs=4K
>>
>> $ ./analyze.py runs/
>> Name                                   IOPS   Error
>> unpatched-d6550e9ed2             19269820.2 ± 1.36%
>> unpatched-dataplane-d6550e9ed2   22351400.4 ± 1.07%
>> v3-dataplane                     22318511.2 ± 0.77%
>> v3-no-dataplane                  18936103.8 ± 1.12%
>> v3-queues-4-no-dataplane         19177021.8 ± 1.45%
>> v3-smp-4-no-dataplane            25509585.2 ± 29.50%
>> v3-smp-4-no-dataplane-no-mq      12466177.2 ± 7.88%
>>
>> Configuration:
>> Name                             Patched? Dataplane? SMP? MQ?
>> unpatched-d6550e9ed2                    N          N    N   N
>> unpatched-dataplane-d6550e9ed2          N          Y    N   N
>> v3-dataplane                            Y          Y    N   N
>> v3-no-dataplane                         Y          N    N   N
>> v3-queues-4-no-dataplane                Y          N    N   Y
>> v3-smp-4-no-dataplane                   Y          N    Y   Y
>> v3-smp-4-no-dataplane-no-mq             Y          N    Y   N
>>
>> SMP means -smp 4.
>> MQ means virtio-blk-pci,num-queues=4.
>>
>> Stefan Hajnoczi (7):
>>   virtio-blk: add VirtIOBlockConf->num_queues
>>   virtio-blk: multiqueue batch notify
>>   virtio-blk: tell dataplane which vq to notify
>>   virtio-blk: associate request with a virtqueue
>>   virtio-blk: live migrate s->rq with multiqueue
>>   virtio-blk: dataplane multiqueue support
>>   virtio-blk: add num-queues device property
>>
>>  hw/block/dataplane/virtio-blk.c | 81 +++++++++++++++++++++++++++++------------
>>  hw/block/dataplane/virtio-blk.h |  2 +-
>>  hw/block/virtio-blk.c           | 52 +++++++++++++++++++++-----
>>  include/hw/virtio/virtio-blk.h  |  6 ++-
>>  4 files changed, 105 insertions(+), 36 deletions(-)
>
> Ping?

I have one minor note regarding the following test:

"Name                             Patched? Dataplane? SMP? MQ?"
"v3-queues-4-no-dataplane                Y          N    N   Y"

If I am not mistaken and understand your test description correctly, it
does not make a lot of sense to use multiple queues when you have
VCPUs=1 (i.e. SMP=N), because the guest block layer will not create
nr_queues greater than the number of CPUs.  So what I want to say is
that in this test, even though you specify num_queues=4 for virtio_blk,
only 1 software queue will be created by the guest block layer, and it
will be mapped to only 1 HW queue, even though you requested 4.  On the
host userspace side you will always receive IO on queue #0.
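
The clamp I mean is roughly the following (a simplified sketch of the
guest's drivers/block/virtio_blk.c init_vq() logic, not the exact code):

    unsigned short num_vqs;
    int err;

    /* read num_queues from config space if VIRTIO_BLK_F_MQ is offered */
    err = virtio_cread_feature(vdev, VIRTIO_BLK_F_MQ,
                               struct virtio_blk_config, num_queues,
                               &num_vqs);
    if (err)
        num_vqs = 1;

    /* never more virtqueues than CPUs: with 1 VCPU, num_vqs == 1 */
    num_vqs = min_t(unsigned int, nr_cpu_ids, num_vqs);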

And I've rebased and tested all my changes on top of your latest v3 set.
The question is: are you interested in this up-to-date RFC of mine,
"simple multithreaded MQ implementation for bdrv_raw", which I sent a
couple of weeks ago?  I can resend it as a single merged commit once more.

--
Roman


