Re: [Qemu-devel] Testing migration under stress


From: Alexey Kardashevskiy
Subject: Re: [Qemu-devel] Testing migration under stress
Date: Tue, 06 Nov 2012 16:22:11 +1100
User-agent: Mozilla/5.0 (X11; Linux i686; rv:16.0) Gecko/20121010 Thunderbird/16.0.1

On 02/11/12 23:12, Orit Wasserman wrote:
> On 11/02/2012 05:10 AM, David Gibson wrote:
>> Asking for some advice on the list.
>>
>> I have prototype savevm and migration support ready for the pseries
>> machine.  They seem to work under simple circumstances (idle guest).
>> To test them more extensively I've been attempting to perform live
>> migrations (just over tcp->localhost) while the guest is active with
>> something.  In particular I've tried while using octave to do a matrix
>> multiply (so exercising the FP unit), and my colleague Alexey has tried
>> during some video encoding.

> As you are doing local migration, one option is to set the speed higher
> than the line speed, as we don't actually send the data; another is to set
> a high downtime.
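
On the source monitor that would be something like the following (the port
and the exact limits here are only examples, not what anyone actually ran):

  (qemu) migrate_set_speed 10G
  (qemu) migrate_set_downtime 2
  (qemu) migrate -d tcp:localhost:4444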

>> However, in each of these cases, we've found that the migration only
>> completes and the source instance only stops after the intensive
>> workload has (just) completed.  What I surmise is happening is that
>> the workload is touching memory pages fast enough that the ram
>> migration code is never getting below the threshold to complete the
>> migration until the guest is idle again.
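
If I understand the migration code correctly, that is exactly the
convergence condition: the source is only stopped once the estimated
blackout fits into the allowed downtime, roughly

  remaining_dirty_bytes / measured_bandwidth <= max_downtime

so when the guest dirties pages at least as fast as they can be sent,
remaining_dirty_bytes never drops below that bound and the final
stop-and-copy stage is put off until the workload goes idle.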

> The workload you chose is really bad for live migration, as all the guest
> does is dirty its memory. I recommend looking for a workload that does
> some networking or disk IO.
> Vinod succeeded in running the SwingBench and SLOB benchmarks, which
> converged OK; I don't know if they run on pseries, but a similar workload
> (small database/warehouse) should be fine.
> We found that SpecJbb, on the other hand, is hard to converge.
> Web workloads or video streaming also do the trick.


My ffmpeg workload is a simple encode, h263+ac3 to h263+ac3, at 64*36 pixels, so it should not be dirtying memory too much. Or is it?

(qemu) info migrate
capabilities: xbzrle: off
Migration status: completed
total time: 14538 milliseconds
downtime: 1273 milliseconds
transferred ram: 389961 kbytes
remaining ram: 0 kbytes
total ram: 1065024 kbytes
duplicate: 181949 pages
normal: 97446 pages
normal bytes: 389784 kbytes

How many bytes were actually transferred? "duplicate" * 4K = 745MB?
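
Checking that against the stats above (assuming 4 KiB pages, and assuming a
"duplicate" page is one filled with a single repeated byte that is sent as
a short marker rather than its full contents):

  normal:    97446 pages * 4 KiB = 389784 kbytes  (matches "normal bytes")
  duplicate: 181949 pages * 4 KiB = 727796 kbytes (~745 MB) of guest RAM,
             but only a few bytes per page on the wire

That would explain why "transferred ram" (389961 kbytes) is barely above
"normal bytes": the duplicate pages cost almost nothing to transmit.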

Is there any tool in QEMU to see how many pages are used/dirty/etc?
"info" does not seem to offer any such statistic.

btw the new guest did not resume (qemu still responds to commands), but this is probably a problem within our "pseries" platform. What is strange is that "info migrate" on the new guest shows nothing:

(qemu) info migrate
(qemu)




> Cheers,
> Orit

>> Does anyone have ideas for testing this better: workloads that are less
>> likely to trigger this behaviour, or settings to tweak in the migration
>> itself to make it more likely to complete while the workload is still
>> active?




--
Alexey


