Re: [Qemu-devel] QEMU throughput is down with SMP

From: Venkateswararao Jujjuri (JV)
Subject: Re: [Qemu-devel] QEMU throughput is down with SMP
Date: Fri, 01 Oct 2010 08:04:40 -0700
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv: Gecko/20100915 Thunderbird/3.1.4

On 10/1/2010 6:38 AM, Ryan Harper wrote:
> * Stefan Hajnoczi <address@hidden> [2010-10-01 03:48]:
>> On Thu, Sep 30, 2010 at 8:19 PM, Venkateswararao Jujjuri (JV)
>> <address@hidden> wrote:
>>> On 9/30/2010 2:13 AM, Stefan Hajnoczi wrote:
>>>> On Thu, Sep 30, 2010 at 1:50 AM, Venkateswararao Jujjuri (JV)
>>>> <address@hidden> wrote:
>>>>>
>>>>> Code: Mainline QEMU (git://git.qemu.org/qemu.git)
>>>>> Machine: LS21 blade.
>>>>> Disk: Local disk through VirtIO.
>>>>> Did not select any cache option. Defaulting to writethrough.
>>>>>
>>>>> Command tested:
>>>>> 3 parallel instances of: dd if=/dev/zero of=/pmnt/my_pw bs=4k
>>>>>
>>>>> QEMU with smp=1
>>>>> 19.3 MB/s + 19.2 MB/s + 18.6 MB/s = 57.1 MB/s
>>>>>
>>>>> QEMU with smp=4
>>>>> 15.3 MB/s + 14.1 MB/s + 13.6 MB/s = 43.0 MB/s
>>>>>
>>>>> Is this expected?
>>>>
>>>> Did you configure with --enable-io-thread?
>>>
>>> Yes I did.
>>>
>>>> Also, try using dd oflag=direct to eliminate effects introduced by the
>>>> guest page cache and really hit the disk.

>>> With oflag=direct I see no difference, and the throughput is so low
>>> that I would not expect to see any difference.
>>> It is 225 KB/s for each thread, either with smp=1 or with smp=4.

>> If I understand correctly you are getting:
>>
>> QEMU oflag=direct with smp=1
>> 225 KB/s + 225 KB/s + 225 KB/s = 675 KB/s
>>
>> QEMU oflag=direct with smp=4
>> 225 KB/s + 225 KB/s + 225 KB/s = 675 KB/s
>>
>> This suggests the degradation for smp=4 is guest kernel page cache or
>> buffered I/O related.  Perhaps lockholder preemption?
>
> or just a single spindle maxed out because the blade hard drive doesn't
> have its write cache enabled (it's disabled by default).

Yes, I am sure we are hitting the limit of the blade's local disk.
The question is why smp=4 degrades performance in the cached mode.

I am running the latest upstream kernel (2.6.36-rc5) on the guest, and using block IO.
Do we have any known issues there that could explain the performance degradation?

I am trying to find a test which demonstrates that QEMU performance improves/scales with SMP.
I would like to use it to validate our new VirtFS threading code (yet to hit the mailing list).
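For reference, a minimal sketch of the benchmark being compared; the file names, the count= bound, and the temp directory are illustrative (the original dd wrote to /pmnt with no count, and the second round added oflag=direct to bypass the guest page cache):

```shell
#!/bin/sh
set -e

# Host side (illustrative): build with the dedicated I/O thread and boot
# the same guest twice, varying only the vCPU count, e.g.:
#   ./configure --enable-io-thread && make
#   qemu-system-x86_64 -smp 1 -drive file=guest.img,if=virtio ...
#   qemu-system-x86_64 -smp 4 -drive file=guest.img,if=virtio ...

# Guest side: three parallel sequential writers, as in the report.
# (Add oflag=direct to each dd for the uncached round; it is omitted
# here because O_DIRECT is not supported on tmpfs-backed temp dirs.)
OUT=$(mktemp -d)

for i in 1 2 3; do
  dd if=/dev/zero of="$OUT/my_pw_$i" bs=4k count=1024 2>/dev/null &
done
wait

# Sum of the per-writer rates reported by dd gives the aggregate figure
# quoted above (e.g. 19.3 + 19.2 + 18.6 = 57.1 MB/s for smp=1).
du -sh "$OUT"
```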

