Re: [Qemu-devel] 答复: [PATCH 0/5] mc146818rtc: fix Windows VM clock faster

From: Hailiang Zhang
Subject: Re: [Qemu-devel] 答复: [PATCH 0/5] mc146818rtc: fix Windows VM clock faster
Date: Wed, 19 Apr 2017 18:41:43 +0800
The root cause is that clock ticks are lost when the period is
changed, because the current code computes the next periodic time as
         next_irq_clock = (cur_clock & ~(period - 1)) + period;

Consider the case where cur_clock = 0x11FF and period = 0x100:
next_irq_clock is then 0x1200, so only 1 clock is left before the
next irq is triggered. Unfortunately, Windows guests (at least
Windows 7) change the period very frequently when running the attached
code, so the lost clocks accumulate and the guest wall time runs
faster and faster.
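
To make the arithmetic in the example concrete, here is a minimal sketch
of the quoted formula in C. Only the expression for next_irq_clock comes
from the text above; the function wrappers and the helper
clocks_short_of_full_period() are hypothetical names introduced for
illustration and are not taken from the QEMU source. The sketch assumes
period is a power of two, which the bit mask requires.

```c
#include <stdint.h>

/*
 * Next periodic IRQ time, using the expression quoted in the thread.
 * Assumes period is a power of two, so (period - 1) is a bit mask.
 */
static inline uint64_t next_irq_clock(uint64_t cur_clock, uint64_t period)
{
    return (cur_clock & ~(period - 1)) + period;
}

/*
 * Hypothetical helper: how many clocks short of a full period the next
 * IRQ arrives when the period is (re)programmed at cur_clock.
 * This equals period - (next_irq_clock - cur_clock).
 */
static inline uint64_t clocks_short_of_full_period(uint64_t cur_clock,
                                                   uint64_t period)
{
    return period - (next_irq_clock(cur_clock, period) - cur_clock);
}
```

With cur_clock = 0x11FF and period = 0x100, next_irq_clock() returns
0x1200, so the IRQ fires after 1 clock instead of 0x100, and
clocks_short_of_full_period() returns 0xFF. If the guest reprograms the
period often, shortfalls of this kind can add up, which matches the
accumulated drift described above.
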
Very interesting.

Yes, indeed.

However, I think that the above should be exactly how the RTC should
work.  The original RTC circuit had 22 divider stages (see page 13 of
the datasheet[1], at the bottom right), and the periodic interrupt
taps the rising edge of one of the dividers (page 16, second
paragraph).  The datasheet also never mentions a comparator being
used to trigger the periodic interrupts.

That was my thought before; however, after more testing, I am not sure
whether re-configuring RegA changes these divider stages internally...

Have you checked that this Windows bug doesn't happen on real
hardware too?  Or is it the combination of driftfix=slew and changing
periods that is the problem?

I have two physical Windows 7 machines, both with
'useplatformclock = off' and NTP disabled, and their wall time is really
accurate. The difference is that the physical machines use the
Q87 LPC chipset, which is mc146818rtc compatible. On a VM, however, the
issue is easily reproduced in just ~10 minutes.

Our tests mostly focus on 'driftfix=slew', and with this patchset the
time is accurate and stable.

I will also test without 'slew' and see what happens...

Well, the time is easily observed to run faster if 'driftfix=slew' is
not used. :(

You mean it only fixes the case where 'driftfix=slew' is used?

No, it fixes both cases.

We encountered this problem too; I tried to fix it a long time ago.
(It seems that your solution is more useful.)
But it seems impossible to fix properly: we need to emulate the
behavior of real hardware, but we didn't find any clear description
of it.

That is the issue: the hardware spec does not detail how the clock is
counted when the timer interval is changed. What we can do at this time
is speculate from the observed behavior. The current RTC is completely
unusable anyway.

And it seems that other virtualization platforms have this problem too:

Hmm, a slower clock is understandable, but does Windows 7 on Hyper-V
really have a faster clock? Have you seen it?

I don't know, we didn't test it. Besides, I'd like to know how long
your test case ran before you judged it stable with the 'driftfix=slew'
option. (My previous patch couldn't fix it completely; it only
narrowed the gap between the guest timer and the real timer.)

More than 12 hours.

Great, I'll test and look into it ... thanks.

Hi Hailiang,

Does this patchset work for you? :)

Yes, I think it works for us, nice work :)


