
From: Ming Lei
Subject: [Qemu-devel] [PATCH 00/14] dataplane: optimization and multi virtqueue support
Date: Wed, 30 Jul 2014 19:39:33 +0800

Hi,

This series makes the following four changes:

        - introduce a selective coroutine bypass mechanism
        to improve the performance of virtio-blk dataplane with
        raw format images

        - introduce an object allocation pool and apply it to
        virtio-blk dataplane to improve its performance

        - linux-aio changes: handle the -EAGAIN and partial
        completion cases, increase max events to 256, and remove one
        unused field from 'struct qemu_laiocb'

        - support multiple virtqueues for virtio-blk dataplane
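
To illustrate the linux-aio item, here is a minimal sketch of how a
submission queue can cope with -EAGAIN and with a submit call that
accepts only part of a batch. It is not the code from this series: the
names pending_queue, queue_add, queue_flush and slot_limited_submit are
hypothetical, requests are plain ints, and slot_limited_submit is a
stand-in for a real call such as io_submit().

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define QUEUE_MAX 256   /* hypothetical capacity of the pending queue */

/* Hypothetical backend signature: returns the number of requests
 * accepted (possibly fewer than n) or a negative errno such as -EAGAIN. */
typedef int (*submit_fn)(const int *reqs, int n, void *opaque);

struct pending_queue {
    int reqs[QUEUE_MAX];   /* requests not yet accepted by the backend */
    int count;
};

/* Append one request; returns 0 on success, -1 if the queue is full. */
static int queue_add(struct pending_queue *q, int req)
{
    if (q->count == QUEUE_MAX) {
        return -1;
    }
    q->reqs[q->count++] = req;
    return 0;
}

/* Try to submit everything that is pending.
 * -EAGAIN: nothing was accepted, keep the whole queue for a later retry.
 * Partial success: drop the accepted head, keep the remainder queued.
 * Any other negative value is passed back to the caller as a hard error. */
static int queue_flush(struct pending_queue *q, submit_fn submit, void *opaque)
{
    int ret;

    if (q->count == 0) {
        return 0;
    }
    ret = submit(q->reqs, q->count, opaque);
    if (ret == -EAGAIN) {
        return 0;
    }
    if (ret < 0) {
        return ret;
    }
    assert(ret <= q->count);
    memmove(q->reqs, q->reqs + ret, (size_t)(q->count - ret) * sizeof(q->reqs[0]));
    q->count -= ret;
    return 0;
}

/* Stand-in backend for the sketch: *opaque is the number of free slots.
 * It accepts at most that many requests and reports -EAGAIN when none are free. */
static int slot_limited_submit(const int *reqs, int n, void *opaque)
{
    int *slots = opaque;
    int accepted = n < *slots ? n : *slots;

    (void)reqs;
    if (accepted == 0) {
        return -EAGAIN;
    }
    *slots -= accepted;
    return accepted;
}
```

In this sketch the caller is expected to call queue_flush() again once
completions have freed slots, so deferred requests are never dropped.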

The virtio-blk multi virtqueue feature will be added to the virtio 1.1
spec[1], and the 3.17 Linux kernel[2] will support the feature in its
virtio-blk driver. For those who want to try it out, the kernel side
patches can be found in either Jens's block tree[3] or linux-next[4].

The following fio script, run inside the VM, is used to measure the
improvement from these patches:

        [global]
        direct=1
        size=128G
        bsrange=4k-4k
        timeout=120
        numjobs=${JOBS}
        ioengine=libaio
        iodepth=64
        filename=/dev/vdc
        group_reporting=1

        [f]
        rw=randread

One quad-core VM (8G RAM) is created on the following host to run the
above fio test:

        - server (16 cores: 8 physical cores, 2 threads per physical core)

The throughput (IOPS) results for this patchset (4 virtqueues per
virtio-blk device) against QEMU 2.1.0-rc5 are as follows: a 30%
throughput improvement can be observed, and scalability for parallel
I/O improves even more (an 80% throughput improvement is observed
with 4 JOBS).

From the above results, we can see that both scalability and
performance improve a lot.

After commit 580b6b2aa2 (dataplane: use the QEMU block
layer for I/O), the average time for submitting a single
request has increased a lot: according to my trace, it has
doubled, even though the block plug & unplug mechanism was
introduced to ease the effect. That is why this patchset first
introduces the selective coroutine bypass mechanism and the
object allocation pool to save that time. Based on QEMU 2.0,
the virtio-blk dataplane multi virtqueue patch alone gets a
better improvement than the current result[5].
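
As a rough illustration of why an object allocation pool saves time on
the submission path, here is a minimal fixed-size pool sketch. It is
not the obj_pool.h added by this series: struct request, struct
req_pool, POOL_SIZE and the req_pool_* functions are all hypothetical
names. The idea is that getting and returning an object is a couple of
array operations instead of a malloc()/free() pair per request.

```c
#include <stddef.h>

#define POOL_SIZE 4   /* hypothetical; kept tiny for the sketch */

/* Hypothetical per-request object. */
struct request {
    long sector;
    int len;
};

struct req_pool {
    struct request objs[POOL_SIZE];        /* preallocated storage */
    struct request *free_list[POOL_SIZE];  /* stack of free objects */
    size_t free_count;
};

/* Mark every preallocated object as free. */
static void req_pool_init(struct req_pool *p)
{
    size_t i;

    for (i = 0; i < POOL_SIZE; i++) {
        p->free_list[i] = &p->objs[i];
    }
    p->free_count = POOL_SIZE;
}

/* Pop a free object, or return NULL when the pool is exhausted
 * (a caller could then fall back to a regular allocation). */
static struct request *req_pool_get(struct req_pool *p)
{
    if (p->free_count == 0) {
        return NULL;
    }
    return p->free_list[--p->free_count];
}

/* Push an object back. The caller must only return objects that came
 * from this pool, and only once each; extra puts are ignored. */
static void req_pool_put(struct req_pool *p, struct request *r)
{
    if (p->free_count < POOL_SIZE) {
        p->free_list[p->free_count++] = r;
    }
}
```

This sketch is not thread-safe; it assumes the pool is only touched
from a single thread.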

TODO:
        - optimize the block layer for linux-aio, so that
        more time can be saved when submitting requests
        - support more than one aio-context to improve
        virtio-blk performance

 async.c                         |    1 +
 block.c                         |  129 ++++++++++++++++++-----
 block/linux-aio.c               |   93 +++++++++++-----
 block/raw-posix.c               |   34 ++++++
 hw/block/dataplane/virtio-blk.c |  221 ++++++++++++++++++++++++++++++---------
 hw/block/virtio-blk.c           |   32 +++++-
 hw/net/virtio-net.c             |    4 +-
 hw/virtio/dataplane/vring.c     |   23 +++-
 hw/virtio/virtio.c              |   23 ++--
 include/block/aio.h             |   13 +++
 include/block/block.h           |    9 ++
 include/block/coroutine.h       |    8 ++
 include/block/coroutine_int.h   |    5 +
 include/hw/virtio/virtio-blk.h  |   13 +++
 include/hw/virtio/virtio.h      |   13 ++-
 include/qemu/obj_pool.h         |   64 ++++++++++++
 qemu-coroutine-lock.c           |    4 +-
 qemu-coroutine.c                |   33 ++++++
 18 files changed, 600 insertions(+), 122 deletions(-)


[1], http://marc.info/?l=linux-api&m=140486843317107&w=2
[2], http://marc.info/?l=linux-api&m=140418368421229&w=2
[3], http://git.kernel.org/cgit/linux/kernel/git/axboe/linux-block.git/ #for-3.17/drivers
[4], https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/
[5], http://marc.info/?l=linux-api&m=140377573830230&w=2

Thanks,




