From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH v7 RFC] block/vxhs: Initial commit to add Veritas HyperScale VxHS block device support
Date: Tue, 8 Nov 2016 15:39:15 +0000
User-agent: Mutt/1.7.1 (2016-10-04)

On Mon, Nov 07, 2016 at 08:27:39PM +0000, Ketan Nilangekar wrote:
> On 11/7/16, 2:22 AM, "Stefan Hajnoczi" <address@hidden> wrote:
> >On Fri, Nov 04, 2016 at 06:30:47PM +0000, Ketan Nilangekar wrote:
> >> > On Nov 4, 2016, at 2:52 AM, Stefan Hajnoczi <address@hidden> wrote:
> >> >> On Thu, Oct 20, 2016 at 01:31:15AM +0000, Ketan Nilangekar wrote:
> >> >> 2. The idea of having a multi-threaded, epoll-based network client was 
> >> >> to drive more throughput by using a multiplexed epoll implementation 
> >> >> and (fairly) distributing IOs from several vdisks (a typical VM is 
> >> >> assumed to have at least 2) across 8 connections. 
> >> >> Each connection is serviced by a single epoll and does not share its 
> >> >> context with other connections/epolls. All memory pools/queues are in 
> >> >> the context of a connection/epoll.
> >> >> The QEMU thread enqueues IO requests into one of the 8 epoll queues in 
> >> >> round-robin fashion. Responses are also handled in the context of an 
> >> >> epoll loop and do not share context with other epolls. Any 
> >> >> synchronization code that you see today in the driver callback handles 
> >> >> the split IOs, which we plan to address by a) implementing readv in 
> >> >> libqnio and b) removing the 4MB limit on write IO size.
> >> >> The number of client epoll threads (8) is a #define in qnio and can 
> >> >> easily be changed. However, our tests indicate that we are able to 
> >> >> drive a good number of IOs using 8 threads/epolls.
> >> >> I am sure there are ways to simplify the library implementation, but 
> >> >> for now the performance of the epoll threads is more than satisfactory.
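
For readers following along, here is a minimal C sketch of the scheme
described above: a fixed pool of epoll worker threads, each owning one
connection and one request queue, with round-robin submission from the
submitter thread. The names and the eventfd-based kick are illustrative
assumptions on my part, not the actual libqnio code:

#include <pthread.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

#define NUM_EPOLL_THREADS 8           /* compile-time constant, as in qnio today */
#define QUEUE_DEPTH       256

struct io_request {
    void (*complete)(struct io_request *req);
    /* buffers, offsets, vdisk id, ... */
};

struct conn_ctx {
    int epfd;                         /* private epoll instance */
    int kick_fd;                      /* eventfd used to wake this worker */
    struct io_request *queue[QUEUE_DEPTH];
    unsigned head, tail;              /* single producer, single consumer */
    pthread_mutex_t lock;             /* protects head/tail/queue */
};

static struct conn_ctx conns[NUM_EPOLL_THREADS];

static void conn_init(struct conn_ctx *c)
{
    struct epoll_event ev = { .events = EPOLLIN, .data.ptr = c };

    c->epfd = epoll_create1(0);
    c->kick_fd = eventfd(0, 0);
    pthread_mutex_init(&c->lock, NULL);
    epoll_ctl(c->epfd, EPOLL_CTL_ADD, c->kick_fd, &ev);
}

/* One of these runs per connection; it never touches another conn_ctx. */
static void *epoll_worker(void *arg)
{
    struct conn_ctx *c = arg;
    struct epoll_event ev;
    uint64_t n;

    for (;;) {
        if (epoll_wait(c->epfd, &ev, 1, -1) <= 0) {
            continue;
        }
        read(c->kick_fd, &n, sizeof(n));          /* drain the kick */

        pthread_mutex_lock(&c->lock);
        while (c->head != c->tail) {
            struct io_request *req = c->queue[c->head++ % QUEUE_DEPTH];
            pthread_mutex_unlock(&c->lock);
            /* ...send req on this thread's connection, read the response,
             * and call req->complete(), all within this epoll context... */
            pthread_mutex_lock(&c->lock);
        }
        pthread_mutex_unlock(&c->lock);
    }
    return NULL;
}

/* Called from the QEMU thread: pick the next connection round-robin. */
static void submit_io(struct io_request *req)
{
    static unsigned next;
    struct conn_ctx *c = &conns[next++ % NUM_EPOLL_THREADS];
    uint64_t one = 1;

    pthread_mutex_lock(&c->lock);
    c->queue[c->tail++ % QUEUE_DEPTH] = req;      /* no overflow check in this sketch */
    pthread_mutex_unlock(&c->lock);
    write(c->kick_fd, &one, sizeof(one));         /* wake only that worker */
}
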
> >> > 
> >> > By the way, when you benchmark with 8 epoll threads, are there any other
> >> > guests with vxhs running on the machine?
> >> > 
> >> 
> >> Yes. In fact, the total throughput with around 4-5 VMs scales well enough 
> >> to saturate around 90% of the available storage throughput of a typical 
> >> PCIe SSD device.
> >> 
> >> > In a real-life situation where multiple VMs are running on a single host
> >> > it may turn out that giving each VM 8 epoll threads doesn't help at all
> >> > because the host CPUs are busy with other tasks.
> >> 
> >> The exact number of epolls required for optimal throughput may be 
> >> something the qnio library can adjust dynamically in subsequent 
> >> revisions. 
> >> 
> >> But as I mentioned, today we can change this simply by rebuilding qnio 
> >> with a different value for that #define.
> >
> >In QEMU there is currently work to add multiqueue support to the block
> >layer.  This enables true multiqueue from the guest down to the storage
> >backend.
> 
> Is there any spec or documentation on this that you can point us to?

The current status is:

1. virtio-blk and virtio-scsi support multiple queues but these queues
   are processed from a single thread today.

2. MemoryRegions can be marked with !global_locking so that their handler
   functions are dispatched without taking the QEMU global mutex.  This
   allows device emulation to run in multiple threads (see the sketch
   after this list).

3. Paolo Bonzini (CCed) is currently working on making the block layer
   (BlockDriverState and co) support access from multiple threads and
   multiqueue.  This is work in progress.
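
To make point 2 concrete, here is a hedged fragment showing how a device
can opt its MMIO region out of the global mutex. MyDevState, MYDEV() and
mydev_ops are made-up names and the surrounding device boilerplate is
omitted; only memory_region_init_io() and
memory_region_clear_global_locking() are the actual memory API calls:

static void mydev_realize(DeviceState *dev, Error **errp)
{
    MyDevState *s = MYDEV(dev);   /* hypothetical device state/cast macro */

    /* Register a 4 KiB MMIO region backed by mydev_ops (not shown). */
    memory_region_init_io(&s->iomem, OBJECT(s), &mydev_ops, s,
                          "mydev-mmio", 0x1000);

    /* Dispatch this region's read/write handlers without taking the
     * QEMU global mutex; the handlers must do their own locking. */
    memory_region_clear_global_locking(&s->iomem);
}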

If you are interested in this work keep an eye out for patch series from
Paolo Bonzini and Fam Zheng.

Stefan
