qemu-devel

Re: [Qemu-devel] [PATCH v2 0/8] virtio-blk: multiqueue support


From: Roman Penyaev
Subject: Re: [Qemu-devel] [PATCH v2 0/8] virtio-blk: multiqueue support
Date: Sat, 4 Jun 2016 17:49:16 +0200

Hi,

On Sat, Jun 4, 2016 at 12:26 AM, Stefan Hajnoczi <address@hidden> wrote:
> On Thu, Jun 02, 2016 at 05:19:41PM -0700, Stefan Hajnoczi wrote:
>> On Mon, May 30, 2016 at 06:25:58PM -0700, Stefan Hajnoczi wrote:
>> > v2:
>> >  * Simplify s->rq live migration [Paolo]
>> >  * Use more efficient bitmap ops for batch notification [Paolo]
>> >  * Fix perf regression due to batch notify BH in wrong AioContext [Christian]
>> >
>> > The virtio_blk guest driver has supported multiple virtqueues since Linux 3.17.
>> > This patch series adds multiple virtqueues to QEMU's virtio-blk emulated
>> > device.
>> >
>> > Ming Lei sent patches previously but these were not merged.  This series
>> > implements virtio-blk multiqueue for QEMU from scratch since the codebase has
>> > changed.  Live migration support for s->rq was also missing from the previous
>> > series and has been added.
>> >
>> > It's important to note that QEMU's block layer does not support multiqueue yet.
>> > Therefore virtio-blk device processes all virtqueues in the same AioContext
>> > (IOThread).  Further work is necessary to take advantage of multiqueue support
>> > in QEMU's block layer once it becomes available.
>> >
>> > I will post performance results once they are ready.
>> >
>> > Stefan Hajnoczi (8):
>> >   virtio-blk: use batch notify in non-dataplane case
>> >   virtio-blk: tell dataplane which vq to notify
>> >   virtio-blk: associate request with a virtqueue
>> >   virtio-blk: add VirtIOBlockConf->num_queues
>> >   virtio-blk: multiqueue batch notify
>> >   virtio-blk: live migrate s->rq with multiqueue
>> >   virtio-blk: dataplane multiqueue support
>> >   virtio-blk: add num-queues device property
>> >
>> >  hw/block/dataplane/virtio-blk.c |  68 +++++++++++----------
>> >  hw/block/dataplane/virtio-blk.h |   2 +-
>> >  hw/block/virtio-blk.c           | 129 +++++++++++++++++++++++++++++++++++-----
>> >  include/hw/virtio/virtio-blk.h  |   8 ++-
>> >  4 files changed, 159 insertions(+), 48 deletions(-)
>>
>> There is a significant performance regression due to batch notify:
>>
>> $ ./analyze.py runs/
>> Name                                   IOPS   Error
>> unpatched-d6550e9ed2             19269820.2 ± 1.36%
>> unpatched-d6550e9ed2-2           19567358.4 ± 2.42%
>> v2-batch-only-f27ed9a4d9         16252227.2 ± 6.09%
>> v2-no-dataplane                  14560225.4 ± 5.16%
>> v2-no-dataplane-2                14622535.6 ± 10.08%
>> v2-no-dataplane-3                13960670.8 ± 7.11%
>>
>> unpatched-d6550e9ed2 is without this patch series.
>> v2-batch-only-f27ed9a4d9 is with Patch 1 only.  v2-no-dataplane is with
>> the patch series (dataplane is not enabled in any of these tests).
>>
>> Next I will compare unpatched dataplane against patched dataplane.  I
>> want to make sure Patch 1 faithfully moved batch notify from dataplane
>> code to generic virtio-blk code without affecting performance.
>>
>> If there is no difference then it means batch notify decreases
>> performance for some workloads (obviously not the same workload that
>> Ming Lei was running).
>
> It turns out that Patch 1 slows down dataplane even though the code
> looks equivalent.  After a lot of poking it turned out to be a subtle
> issue:
>
> The order of BHs in the AioContext->first_bh list affects performance.
> Linux AIO (block/linux-aio.c) invokes completion callbacks from a BH.
> Performance is much better if virtio-blk.c's batch BH is after the
> completion BH.
>
> The "fast" ordering notifies the guest in ~300 nanoseconds after the
> last request completion.
>
> The "slow" ordering sometimes takes 100 microseconds after the last
> request completion before the guest is notified.  It probably depends on
> whether the event loop is kicked by another source.
>
> I'm thinking of scrapping the batch BH and instead using a notify
> plug/unplug callback to suppress notification until the last request has
> been processed.
>
> I also checked that batch notification does indeed improve performance
> compared to no batching.  It offers a nice boost so we do want to port
> the feature from dataplane to non-dataplane.
>
> For the time being: consider this patch series broken due to the
> performance regression.
>
> Stefan
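
The ordering effect described above can be illustrated with a toy model. This is only a sketch, not QEMU code: it assumes the event loop makes one in-order pass over the BH list per iteration and runs a BH only if it is already scheduled when the pass reaches it. The names `BH`, `poll_once` and `passes_until_notify` are made up for the illustration.

```python
class BH:
    """Hypothetical stand-in for a bottom half: a callback plus a scheduled flag."""

    def __init__(self, name, cb):
        self.name = name
        self.cb = cb
        self.scheduled = False


def poll_once(bh_list, log):
    """One in-order pass: run every BH that is scheduled when it is reached."""
    for bh in bh_list:
        if bh.scheduled:
            bh.scheduled = False
            log.append(bh.name)
            bh.cb()


def passes_until_notify(order):
    """Count the passes needed before the batch notify BH runs.

    `order` lists the two BH names in list order, first entry visited first.
    """
    log = []
    batch = BH("batch_notify", lambda: None)

    def complete():
        # Request completion schedules the batch notification (assumed model).
        batch.scheduled = True

    completion = BH("completion", complete)
    by_name = {"completion": completion, "batch_notify": batch}
    bh_list = [by_name[name] for name in order]

    completion.scheduled = True  # an I/O completion is pending
    passes = 0
    while "batch_notify" not in log:
        passes += 1
        poll_once(bh_list, log)
    return passes


fast = passes_until_notify(["completion", "batch_notify"])
slow = passes_until_notify(["batch_notify", "completion"])
print("fast ordering passes:", fast)
print("slow ordering passes:", slow)
```

In this model the "slow" ordering needs an extra pass over the list before the guest is notified. How long that extra pass takes in a real event loop is not modelled here.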

Stefan, could you please share your workloads?  I ran my fio scripts and did
not notice any significant difference.  It would be interesting to understand
the root cause.
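
For reference, the size of the drop in the figures quoted above can be worked out as follows. This is a small sketch that only reuses three IOPS values from the quoted table. Taking the first unpatched run as the baseline is an assumption made here, and `relative_change` is a hypothetical helper, not part of analyze.py.

```python
# IOPS values copied from the quoted analyze.py output.
results = {
    "unpatched-d6550e9ed2": 19269820.2,
    "v2-batch-only-f27ed9a4d9": 16252227.2,
    "v2-no-dataplane": 14560225.4,
}


def relative_change(baseline, value):
    """Percentage change of value relative to baseline (negative means slower)."""
    return (value - baseline) / baseline * 100.0


baseline = results["unpatched-d6550e9ed2"]  # assumed baseline run
batch_only = relative_change(baseline, results["v2-batch-only-f27ed9a4d9"])
full_series = relative_change(baseline, results["v2-no-dataplane"])

print(f"Patch 1 only: {batch_only:.1f}%")
print(f"Full series:  {full_series:.1f}%")
```

Both changes are well outside the error margins reported for the runs.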

--
Roman


