Re: [Qemu-devel] Re: [PATCH v2] qemu-kvm: Speed up of the dirty-bitmap-t

From: Alexander Graf
Subject: Re: [Qemu-devel] Re: [PATCH v2] qemu-kvm: Speed up of the dirty-bitmap-traveling
Date: Thu, 18 Feb 2010 11:30:48 +0100

On 18.02.2010, at 06:57, OHMURA Kei wrote:

>>>>>> "We think"? I mean - yes, I think so too. But have you actually measured 
>>>>>> it?
>>>>>> How much improvement are we talking here?
>>>>>> Is it still faster when a bswap is involved?
>>>>> Thanks for pointing out.
>>>>> I will post the data for x86 later.
>>>>> However, I don't have a test environment to check the impact of bswap.
>>>>> Would you please measure the run time of the following section if 
>>>>> possible?
>>>> It'd make more sense to have a real stand alone test program, no?
>>>> I can try to write one today, but I have some really nasty important bugs 
>>>> to fix first.
>>> OK.  I will prepare test code with sample data.  Since I found a ppc 
>>> machine nearby, I will run the code and post the results for
>>> x86 and ppc.
>>> By the way, the following data is the result for x86 measured in QEMU/KVM.  
>>> It shows how many times the function is called (#called), the runtime 
>>> of the original function (orig.), the runtime with this patch (patch), and 
>>> the speedup ratio (ratio).
>> That does indeed look promising!
>> Thanks for doing this micro-benchmark. I just want to be 100% sure that it 
>> doesn't hurt performance on big-endian hosts.
> I measured the runtime of the test code with sample data.  My test 
> environment and results are described below.
> x86 Test Environment:
> CPU: 4x Intel Xeon Quad Core 2.66GHz
> Mem size: 6GB
> ppc Test Environment:
> CPU: 2x Dual Core PPC970MP
> Mem size: 2GB
> The sample dirty-bitmap data was produced by QEMU/KVM while the guest OS
> was being live-migrated.  To measure the runtime, I copied
> cpu_get_real_ticks() from QEMU into my test program.
> Experimental results:
> Test1: Guest OS read 3GB file, which is bigger than memory.
>        orig.(msec)    patch(msec)    ratio
> x86    0.3            0.1            6.4
> ppc    7.9            2.7            3.0
> Test2: Guest OS read/write 3GB file, which is bigger than memory.
>        orig.(msec)    patch(msec)    ratio
> x86    12.0           3.2            3.7
> ppc    251.1          123            2.0
> I also measured the runtime of bswap itself on ppc, and found that it 
> accounted for only 0.3% to 0.7% of the runtimes listed above.

Awesome! Thank you so much for giving actual data to make me feel comfortable 
with it :-).
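
For readers following along without the patch at hand, here is a rough sketch of the kind of comparison being measured. It assumes the speedup comes from walking the dirty bitmap one 64-bit word at a time and skipping all-zero words, with a byte swap applied only to non-zero words on hosts whose byte order differs from the bitmap's. The names collect_dirty_bitwise(), collect_dirty_wordwise(), swap64() and the need_swap flag are made up for illustration and are not taken from the patch or from QEMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable 64-bit byte reversal; stands in for the bswap discussed above. */
static inline uint64_t swap64(uint64_t v)
{
    v = ((v & 0x00ff00ff00ff00ffULL) << 8)  | ((v >> 8)  & 0x00ff00ff00ff00ffULL);
    v = ((v & 0x0000ffff0000ffffULL) << 16) | ((v >> 16) & 0x0000ffff0000ffffULL);
    return (v << 32) | (v >> 32);
}

/* Baseline: test every bit individually, even in words that are all zero.
 * Writes the index of each set bit to out[] and returns how many were found. */
size_t collect_dirty_bitwise(const uint64_t *bitmap, size_t nwords, size_t *out)
{
    size_t n = 0;

    assert(bitmap != NULL && out != NULL);
    for (size_t i = 0; i < nwords * 64; i++) {
        if ((bitmap[i / 64] >> (i % 64)) & 1) {
            out[n++] = i;
        }
    }
    return n;
}

/* Word-at-a-time: skip clean words with a single compare, and byte-swap
 * only the words that actually contain dirty bits. */
size_t collect_dirty_wordwise(const uint64_t *bitmap, size_t nwords,
                              int need_swap, size_t *out)
{
    size_t n = 0;

    assert(bitmap != NULL && out != NULL);
    for (size_t w = 0; w < nwords; w++) {
        uint64_t word = bitmap[w];

        if (word == 0) {
            continue;               /* nothing dirty in these 64 pages */
        }
        if (need_swap) {
            word = swap64(word);    /* convert to host byte order */
        }
        for (unsigned j = 0; j < 64; j++) {
            if ((word >> j) & 1) {
                out[n++] = w * 64 + j;
            }
        }
    }
    return n;
}
```

Both functions expect out[] to have room for nwords * 64 entries. Timing each one over the same sample bitmap (the thread used a copy of cpu_get_real_ticks() for this) gives the orig./patch comparison; because the swap only runs on non-zero words, its cost shrinks as the bitmap gets sparser.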

