On Wed, Sep 12, 2012 at 06:27:14PM +0200, Stefan Weil wrote:
On 12.09.2012 15:54, Anthony Liguori wrote:
Hi,
We've been running into a lot of problems lately with Windows guests and
I think they all ultimately could be addressed by revisiting the missed
tick catchup algorithms that we use. Mike and I spent a while talking
about it yesterday and I wanted to take the discussion to the list to
get some additional input.
Here are the problems we're seeing:
1) Rapid reinjection can lead to time moving faster for short bursts of
time. We've seen a number of RTC watchdog BSoDs and it's possible
that at least one cause is reinjection speed.
2) When hibernating a host system, the guest is essentially paused
for a long period of time. This results in a very large tick catchup
while also resulting in a large skew in guest time.
I've gotten reports of the tick catchup consuming a lot of CPU time
from rapid delivery of interrupts (although I haven't reproduced this
yet).
3) Windows appears to have a service that periodically syncs the guest
time with the hardware clock. I've been told the resync period is an
hour. For large clock skews, this can compete with reinjection
resulting in a positive skew in time (the guest can be ahead of the
host).
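To put rough numbers on problems 1) and 2), here is a back-of-the-envelope
sketch. It is not QEMU's actual reinjection code. The function names, the
64 Hz tick rate, and the idea of a fixed "speedup" factor are assumptions
made purely for illustration.

```python
def missed_ticks(paused_seconds, tick_hz):
    """Ticks the guest did not receive while it was paused."""
    return int(paused_seconds * tick_hz)


def catchup_seconds(backlog_ticks, tick_hz, speedup):
    """Wall-clock time needed to drain a tick backlog.

    `speedup` is a hypothetical reinjection factor: ticks are delivered at
    speedup * tick_hz, so guest time runs `speedup` times faster than host
    time until the backlog is gone.
    """
    if speedup <= 1.0:
        raise ValueError("speedup must exceed 1.0 to drain the backlog")
    # Only the ticks delivered beyond the normal rate reduce the backlog.
    extra_ticks_per_second = (speedup - 1.0) * tick_hz
    return backlog_ticks / extra_ticks_per_second


# Assumed scenario: host hibernated for 8 hours, guest timer at 64 Hz.
backlog = missed_ticks(8 * 3600, 64)
drain = catchup_seconds(backlog, 64, 2.0)
print(backlog, drain)
```

With these assumed numbers the backlog is large, and a gentle speedup takes
hours to drain, while an aggressive one makes guest time race ahead for the
duration of the burst. That is the trade-off behind the watchdog and CPU
usage reports above.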
Nearly every modern OS (including Windows) uses NTP
or some other protocol to get the time via a TCP/IP network.
The drifts we are talking about will take ages for NTP to fix.
If a guest OS detects a small difference of time, it will usually
accelerate or decelerate the OS clock until the time is
synchronised again.
Large jumps in network time will make the OS time jump, too.
With a little bad luck, QEMU's reinjection will add the
positive skew, no matter whether the guest is Linux or Windows.
As far as I know, NTP will never make the OS clock jump. The purpose of NTP
is to fix time gradually, so apps will not notice. ntpdate is used to
force clock synchronization, but it should be run manually.
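To illustrate why gradual correction "will take ages" for the skews
discussed here, a small sketch of the arithmetic follows. It assumes a
maximum slew rate of 500 ppm, which is the usual limit for ntpd. The
function name is made up for this example.

```python
def slew_duration_seconds(offset_seconds, max_slew_ppm=500.0):
    """Time needed to remove a clock offset by slewing only (no step).

    max_slew_ppm is the assumed maximum rate adjustment, in parts per
    million: at 500 ppm the clock gains or loses 0.5 ms per second.
    """
    return abs(offset_seconds) / (max_slew_ppm * 1e-6)


# Assumed example: the guest is one hour off after a long pause.
seconds = slew_duration_seconds(3600)
print(seconds / 86400, "days")
```

Under that assumption a one-hour skew needs roughly 83 days of pure
slewing, so some other mechanism has to handle skews of this size.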