Re: [Qemu-devel] [RFC 0/3] aio: experimental virtio-blk polling mode
From: Karl Rister
Subject: Re: [Qemu-devel] [RFC 0/3] aio: experimental virtio-blk polling mode
Date: Mon, 14 Nov 2016 09:36:44 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0
On 11/14/2016 09:26 AM, Stefan Hajnoczi wrote:
> On Fri, Nov 11, 2016 at 01:59:25PM -0600, Karl Rister wrote:
>> On 11/09/2016 11:13 AM, Stefan Hajnoczi wrote:
>>> Recent performance investigation work done by Karl Rister shows that the
>>> guest->host notification takes around 20 us. This is more than the
>>> "overhead" of QEMU itself (e.g. block layer).
>>>
>>> One way to avoid the costly exit is to use polling instead of notification.
>>> The main drawback of polling is that it consumes CPU resources. In order
>>> to benefit performance the host must have extra CPU cycles available on
>>> physical CPUs that aren't used by the guest.
>>>
>>> This is an experimental AioContext polling implementation. It adds a
>>> polling callback into the event loop. Polling functions are implemented
>>> for virtio-blk virtqueue guest->host kick and Linux AIO completion.
>>>
>>> The QEMU_AIO_POLL_MAX_NS environment variable sets the number of
>>> nanoseconds to poll before entering the usual blocking poll(2) syscall.
>>> Try setting this variable to the time from old request completion to new
>>> virtqueue kick.
>>>
>>> By default no polling is done. QEMU_AIO_POLL_MAX_NS must be set to get
>>> any polling!
>>>
>>> Karl: I hope you can try this patch series with several
>>> QEMU_AIO_POLL_MAX_NS values. If you don't find a good value we should
>>> double-check the tracing data to see if this experimental code can be
>>> improved.
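To make the mechanism described in the cover letter concrete, here is a
minimal sketch of a poll-then-block event loop step. It assumes a
hypothetical handler interface; all names (poll_max_ns, PollHandler,
event_loop_iteration) are illustrative, not the patch's actual API:

    /*
     * Sketch only: poll event sources in userspace for up to
     * QEMU_AIO_POLL_MAX_NS nanoseconds before falling back to the usual
     * blocking poll(2) syscall.  All names are hypothetical.
     */
    #include <poll.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <time.h>

    static int64_t poll_max_ns;  /* 0 = no polling (the default) */

    static void poll_init(void)
    {
        const char *s = getenv("QEMU_AIO_POLL_MAX_NS");
        poll_max_ns = s ? strtoll(s, NULL, 10) : 0;
    }

    static int64_t now_ns(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
    }

    /*
     * A poll handler checks one event source for new work without a
     * syscall, e.g. a virtqueue's ring index or the Linux AIO
     * completion ring, and processes the work if any is found.
     */
    typedef bool (*PollHandler)(void *opaque);

    static void event_loop_iteration(PollHandler *handlers, void **opaque,
                                     int nhandlers,
                                     struct pollfd *pfds, int nfds)
    {
        if (poll_max_ns > 0) {
            int64_t deadline = now_ns() + poll_max_ns;
            do {
                for (int i = 0; i < nhandlers; i++) {
                    if (handlers[i](opaque[i])) {
                        return;  /* made progress without blocking */
                    }
                }
            } while (now_ns() < deadline);
        }
        /* No work found within the budget: block in the kernel as usual. */
        poll(pfds, nfds, -1);
    }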
>>
>> Stefan
>>
>> I ran some quick tests with your patches and got some pretty good gains,
>> but also some seemingly odd behavior.
>>
>> These results are for a 5 minute test doing sequential 4KB requests from
>> fio using O_DIRECT, libaio, and an IO depth of 1. The requests are
>> performed directly against the virtio-blk device (no filesystem), which
>> is backed by a 400GB NVMe card.
>>
>> QEMU_AIO_POLL_MAX_NS      IOPS
>> unset                   31,383
>> 1                       46,860
>> 2                       46,440
>> 4                       35,246
>> 8                       34,973
>> 16                      46,794
>> 32                      46,729
>> 64                      35,520
>> 128                     45,902
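For reference, a fio job file along these lines would reproduce the
workload described above. The device path and the read direction are
assumptions; the posting only says "sequential 4KB requests":

    ; Assumed job file: 5 minutes of sequential 4KB requests at queue
    ; depth 1 against the raw virtio-blk device (path illustrative).
    [seq-4k-qd1]
    filename=/dev/vdb
    rw=read
    bs=4k
    direct=1
    ioengine=libaio
    iodepth=1
    runtime=300
    time_based=1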
>
> The environment variable is in nanoseconds. The range of values you
> tried is very small (all <1 usec). It would be interesting to try
> larger values in the ballpark of the latencies you have traced. For
> example 2000, 4000, 8000, 16000, and 32000 ns.
Agreed. As I alluded to in another post, I decided to start at 1 and
double the value until I saw a difference, expecting that it would have
to get quite large before that happened. The results went in a
different direction, and then I got distracted by the variation at
certain points. I figured the fact that noticeable improvements were
possible with such low values was interesting in itself.

I will definitely continue the progression and capture some larger values.
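As a back-of-envelope check on why even tiny values help: at an IO depth
of 1, IOPS is the reciprocal of per-request latency, so 31,383 IOPS
corresponds to roughly 31.9 us per request and 46,860 IOPS to roughly
21.3 us. The ~10.5 us saved is about half of the ~20 us notification
cost traced earlier, which would be consistent with polling
short-circuiting part of the guest->host kick path.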
>
> Very interesting that QEMU_AIO_POLL_MAX_NS=1 performs so well without
> much CPU overhead.
>
>> I found the results for 4, 8, and 64 odd, so I re-ran some tests to check
>> for consistency. I used values of 2 and 4 and ran each 5 times. Here
>> is what I got:
>>
>> Iteration    QEMU_AIO_POLL_MAX_NS=2    QEMU_AIO_POLL_MAX_NS=4
>> 1                        46,972                    35,434
>> 2                        46,939                    35,719
>> 3                        47,005                    35,584
>> 4                        47,016                    35,615
>> 5                        47,267                    35,474
>>
>> So the results seem consistent.
>
> That is interesting. I don't have an explanation for the consistent
> difference between 2 and 4 ns polling time. The time difference is so
> small, yet the IOPS difference is clear.
>
> Comparing traces could shed light on the cause for this difference.
>
>> I saw some discussion on the patches that makes me think you'll be
>> making some changes, is that right? If so, I may wait for the updates
>> and then we can run the much more exhaustive set of workloads
>> (sequential read and write, random read and write) at various block
>> sizes (4, 8, 16, 32, 64, 128, and 256 KB) and multiple IO depths (1 and
>> 32) that we were doing when we started looking at this.
>
> I'll send an updated version of the patches.
>
> Stefan
>
--
Karl Rister <address@hidden>