From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] Performance problem and improvement about block drive on NFS shares with libnfs
Date: Thu, 6 Apr 2017 11:40:13 +0100
User-agent: Mutt/1.8.0 (2017-02-23)

On Sat, Apr 01, 2017 at 01:23:46PM +0800, Jaden Liang wrote:
> Hello,
> 
> I ran qemu with drive files accessed via libnfs recently, and found a
> performance problem and an idea for an improvement.
> 
> I started qemu with 6 drive parameters like
> nfs://127.0.0.1/dir/vm-disk-x.qcow2, which pointed to a local NFS
> server, then used iometer in the guest machine to test 4K random read
> and random write IO performance. I found that as the IO depth goes up,
> the IOPS hit a bottleneck. Looking into the cause, I found that the
> main thread of qemu used 100% CPU. The perf data shows that the CPU
> hotspots are the send / recv calls in libnfs. From reading the source
> code of libnfs and of the qemu block driver block/nfs.c, libnfs only
> supports a single worker thread, and the network events of the nfs
> interface in qemu are all registered in the epoll of the main thread.
> That is why the main thread uses 100% CPU.
> 
> After the analysis above, an improvement idea came up: start a thread
> for every drive when libnfs opens the drive file, then create an epoll
> in every drive thread to handle all of that drive's network events. I
> finished a demo modification in block/nfs.c and reran iometer in the
> guest machine; the performance increased a lot. Random read IOPS
> increased by almost 100%, and random write IOPS increased by about 68%.
> 
> Test model details:
> VM configuration: 6 vdisks in 1 VM
> Test tool and parameters: iometer with 4K random read and random write
> Backend physical drives: 2 SSDs, with the 6 vdisks spread across the 2 SSDs
> 
> Before the modification (IOPS):
> IO depth       1      2      4      8      16     32
> 4K randread    16659  28387  42932  46868  52108  55760
> 4K randwrite   12212  19456  30447  30574  35788  39015
> 
> After the modification (IOPS):
> IO depth       1      2      4      8      16     32
> 4K randread    17661  33115  57138  82016  99369  109410
> 4K randwrite   12669  21492  36017  51532  61475  65577
> 
> I could submit a patch that meets the coding standards later. For now I
> would like some advice about this modification. Is this a reasonable way
> to improve performance on NFS shares, or is there a better way?
> 
> Any suggestions would be great! Also, please feel free to ask questions.

Did you try using -object iothread,id=iothread1 -device
virtio-blk-pci,iothread=iothread1,... to define IOThreads for each
virtio-blk-pci device?
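
A minimal sketch of such a command line, assuming two NFS-backed disks;
the paths, IDs, and -m/-smp values below are illustrative, not taken
from this thread:

    # Hypothetical example: one IOThread per NFS-backed virtio-blk disk.
    qemu-system-x86_64 -enable-kvm -m 4096 -smp 4 \
        -object iothread,id=iothread0 \
        -object iothread,id=iothread1 \
        -drive file=nfs://127.0.0.1/dir/vm-disk-0.qcow2,format=qcow2,if=none,id=drive0 \
        -device virtio-blk-pci,drive=drive0,iothread=iothread0 \
        -drive file=nfs://127.0.0.1/dir/vm-disk-1.qcow2,format=qcow2,if=none,id=drive1 \
        -device virtio-blk-pci,drive=drive1,iothread=iothread1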

The block/nfs.c code already supports IOThread so you can run multiple
threads and don't need to use 100% CPU in the main loop.

Stefan
