From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] is there a limit on the number of in-flight I/O operations?
Date: Thu, 27 Aug 2015 17:48:14 +0100
User-agent: Mutt/1.5.23 (2014-03-12)

On Mon, Aug 25, 2014 at 03:50:02PM -0600, Chris Friesen wrote:
> On 08/23/2014 01:56 AM, Benoît Canet wrote:
> >>The Friday 22 Aug 2014 at 18:59:38 (-0600), Chris Friesen wrote:
> >>On 07/21/2014 10:10 AM, Benoît Canet wrote:
> >>>>The Monday 21 Jul 2014 at 09:35:29 (-0600), Chris Friesen wrote:
> >>>>On 07/21/2014 09:15 AM, Benoît Canet wrote:
> >>>>>>The Monday 21 Jul 2014 at 08:59:45 (-0600), Chris Friesen wrote:
> >>>>>>On 07/19/2014 02:45 AM, Benoît Canet wrote:
> >>>>>>
> >>>>>>>I think in the throttling case the number of in-flight operations is
> >>>>>>>limited by the emulated hardware queue. Otherwise requests would pile
> >>>>>>>up and throttling would be ineffective.
> >>>>>>>
> >>>>>>>So this number should be around: #define VIRTIO_PCI_QUEUE_MAX 64 or
> >>>>>>>something like that.
> >>>>>>
> >>>>>>Okay, that makes sense.  Do you know how much data can be written as
> >>>>>>part of a single operation?  We're using 2MB hugepages for the guest
> >>>>>>memory, and we saw the qemu RSS numbers jump from 25-30MB during normal
> >>>>>>operation up to 120-180MB when running dbench.  I'd like to know what
> >>>>>>the worst-case would be.
> >>>
> >>>Sorry, I didn't understand this part at first read.
> >>>
> >>>In the Linux guest, can you monitor:
> >>>address@hidden:~$ cat /sys/class/block/xyz/inflight ?
> >>>
> >>>This would give us a fairly precise number of the requests actually in
> >>>flight between the guest and qemu.
> >>
> >>
> >>After a bit of a break I'm looking at this again.
> >>
> >
> >Strange.
> >
> >I would use dd with the flag oflag=nocache to make sure the write requests
> >do not go into the guest cache, though.
> >
> >Best regards
> >
> >Benoît
> >
> >>While doing "dd if=/dev/zero of=testfile bs=1M count=700" in the guest, I
> >>got a max "inflight" value of 181.  This seems quite a bit higher than
> >>VIRTIO_PCI_QUEUE_MAX.
> >>
> >>I've seen throughput as high as ~210 MB/sec, which also kicked the RSS
> >>numbers up above 200MB.
> >>
> >>I tried dropping VIRTIO_PCI_QUEUE_MAX down to 32 (it didn't seem to work at
> >>all for values much less than that, though I didn't bother getting an exact
> >>value) and it didn't really make any difference; I still saw inflight values
> >>as high as 177.
> 
> I think I might have a glimmering of what's going on.  Someone please
> correct me if I get something wrong.
> 
> I think that VIRTIO_PCI_QUEUE_MAX doesn't really mean anything with respect
> to max inflight operations, and neither does virtio-blk calling
> virtio_add_queue() with a queue size of 128.
> 
> I think what's happening is that virtio_blk_handle_output() spins, pulling
> data off the 128-entry queue and calling virtio_blk_handle_request().  At
> this point that queue entry can be reused, so the queue size isn't really
> relevant.
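
For reference, here is a minimal sketch of the dispatch loop being described.
The toy_* names and types are illustrative stand-ins, not QEMU's actual
definitions:

/*
 * The handler drains the avail ring without waiting for I/O completions, so
 * avail-ring slots are recycled long before the requests themselves finish.
 */
#include <stddef.h>

typedef struct ToyRequest ToyRequest;

/* Stand-in for virtqueue_pop(): returns the next request the guest has
 * posted, or NULL once the avail ring has been drained. */
ToyRequest *toy_virtqueue_pop(void);

/* Stand-in for virtio_blk_handle_request(): submits the request as
 * asynchronous I/O; completion is reported later through the used ring. */
void toy_handle_request(ToyRequest *req);

/* Analogue of virtio_blk_handle_output(): nothing here blocks on I/O, so
 * the number of requests held in flight is bounded only by how many the
 * guest can post, i.e. by the descriptor table discussed below. */
void toy_handle_output(void)
{
    ToyRequest *req;

    while ((req = toy_virtqueue_pop()) != NULL)
        toy_handle_request(req);
}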

The number of pending virtio-blk requests is finite.  You missed the
vring descriptor table where buffer descriptors live - that's what
prevents the guest from issuing an infinite number of pending requests.

You are correct that the host moves along the "avail" ring, but the actual
buffer descriptors in the vring (struct vring_desc) stay put until the
request's completion has been processed by the guest driver from the "used"
ring.
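
For reference, this is the descriptor layout the paragraph above refers to
(a sketch following the legacy virtio vring layout; the comments are mine):

#include <stdint.h>

/* One entry of the vring descriptor table.  QEMU's virtio-blk virtqueue has
 * 128 of these, and an entry stays occupied from the moment the guest posts
 * a request until the guest driver consumes the completion from the used
 * ring; that occupancy is what caps the number of in-flight requests. */
struct vring_desc {
    uint64_t addr;   /* guest-physical address of the buffer */
    uint32_t len;    /* buffer length in bytes */
    uint16_t flags;  /* VRING_DESC_F_NEXT / _WRITE / _INDIRECT */
    uint16_t next;   /* index of the next descriptor in the chain */
};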

Each virtio-blk request takes at least 2 vring descriptors (data buffer
+ request status byte).  I think 3 is common in practice because drivers
like to submit struct virtio_blk_outhdr in its own descriptor.
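
A sketch of that request layout (the header struct follows the virtio-blk
spec; the chain diagram is illustrative):

#include <stdint.h>

/* Header the guest driver places at the front of each virtio-blk request. */
struct virtio_blk_outhdr {
    uint32_t type;    /* VIRTIO_BLK_T_IN (read), VIRTIO_BLK_T_OUT (write), ... */
    uint32_t ioprio;
    uint64_t sector;  /* starting sector, in 512-byte units */
};

/*
 * Typical descriptor chain for one request when the header gets its own
 * descriptor:
 *
 *   desc[n]   -> struct virtio_blk_outhdr   (device-readable)
 *   desc[n+1] -> data buffer                (readable for writes,
 *                                            writable for reads)
 *   desc[n+2] -> 1-byte status              (device-writable)
 */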

So we have a limit of 128 / 2 = 64 I/O requests or 128 / 3 = 42 I/O
requests.

If you rerun the tests with the fio job file I posted, the results
should show that only 64 or 42 requests are pending at any given time.

Stefan


