Re: [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount.


From: Frederic Konrad
Subject: Re: [Qemu-devel] [RFC 0/3] Determinitic behaviour with icount.
Date: Thu, 18 Jul 2013 18:31:19 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130625 Thunderbird/17.0.7

On 18/07/2013 17:35, Paolo Bonzini wrote:
Il 18/07/2013 17:06, Peter Maydell ha scritto:
On 18 July 2013 16:02,  <address@hidden> wrote:
As I said in the last email, we have issues with determinism with icount.
We are wondering if determinism is really ensured with icount?
My opinion is that it *should* be deterministic but it would
be unsurprising if the determinism had got broken along the way.
First of all, it can only be deterministic if the guest satisfies (at
least) all the following conditions:

1) only uses timers that QEMU bases on vm_clock (which means that you
should use "-rtc clock=vm"---sorry Fred, didn't think about this in the
previous answer);

Oops sorry, I didn't mention that, but we used "-rtc clock=vm" for our tests.

2) never does any network operation nor any asynchronous disk I/O operation

3) never halts the VCPU waiting for an interrupt


Point 1 is obvious.


To explain point 2, let's consider what happens if a block device uses
synchronous vs. asynchronous I/O.

With synchronous I/O, each block device operation will complete
immediately.  All clocks are stalled during the operation.

With asynchronous I/O, each block device operation will be done while
the CPU is running.  If the CPU is polling a completion flag, the number
of instructions executed (thus icount) depends on how long it takes to
do I/O.
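
The polling case can be illustrated with a toy model. This is only a sketch, not QEMU code: the function name, the cost of 3 instructions per poll and the 10 instructions for issuing the request are all made-up assumptions. It shows that the instruction count at which the guest sees completion is fixed for synchronous I/O but depends on host latency for asynchronous I/O.

```python
def icount_at_completion(io_latency_polls, synchronous):
    """Toy model: instruction count when the guest observes I/O completion.

    io_latency_polls stands in for host-side I/O latency, expressed as the
    number of failed polls the guest manages to execute before completion.
    All instruction costs are arbitrary illustrative values.
    """
    icount = 10  # instructions spent issuing the request (assumed)
    if not synchronous:
        # Asynchronous I/O: the CPU keeps running and polls the flag, so
        # each failed poll adds instructions (load, compare, branch).
        icount += 3 * io_latency_polls
    # Synchronous I/O: the operation finishes before the next guest
    # instruction, so no failed polls are ever executed.
    icount += 3  # the final, successful poll
    return icount


# Same guest, two different host latencies.
print(icount_at_completion(5, synchronous=True),
      icount_at_completion(500, synchronous=True))
print(icount_at_completion(5, synchronous=False),
      icount_at_completion(500, synchronous=False))
```

In the synchronous case both runs end at the same count; in the asynchronous case the count varies with the latency, which is the nondeterminism described above.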

So I suppose this can happen as soon as there is any network card or block device.

We probably need to disable it until we can save and replay I/O, to get this thing
working.



To explain point 3 (which is the only one that _might_ be fixable),
let's see what happens if the VCPU halts waiting for an interrupt.  If
that is the case, and you haven't done any asynchronous I/O, there
should be active vm_clock timers, and you have another possible source
of non-deterministic behavior.

The current QEMU behavior is (and has always been) to start tracking
rt_clock.  This is obviously not deterministic.  Note that with the
switch to separate threads for iothread/VCPU, the algorithm to do this
has become much better.  Let's look at a couple possibilities:

2) jump to the next vm_clock deadline.  This sounds appealing, but it is
still nondeterministic in the general case when the guest *is* doing
asynchronous I/O too.  How many vm_clock timers do you run before I/O
finishes?  Furthermore, the vm_clock might move too fast.  Think of an
RTC clock whose alarm registers are 0/0/0 so it fires at midnight; if it
is the only active vm_clock timer, you end up in 2107 even before the
kernel boots!
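
A minimal sketch of the "jump to the next deadline" idea, to show why the clock can move too fast. This is not QEMU's implementation; the function name and the timer values are hypothetical.

```python
NS_PER_SEC = 1_000_000_000


def warp_to_next_deadline(vm_clock_ns, deadlines_ns):
    """Toy model: new vm_clock value when the VCPU halts.

    The clock jumps straight to the earliest timer deadline. If a timer has
    already expired, or there are no timers, the clock does not move.
    """
    if not deadlines_ns:
        return vm_clock_ns
    return max(vm_clock_ns, min(deadlines_ns))


now = 1 * NS_PER_SEC
tick = now + 10_000_000            # a 10 ms periodic tick (assumed)
alarm = now + 86_400 * NS_PER_SEC  # an alarm one day away (assumed)

# With a short timer active, the jump is small.
print(warp_to_next_deadline(now, [tick, alarm]) - now)
# With only the far-off alarm active, a single halt skips a whole day.
print(warp_to_next_deadline(now, [alarm]) - now)
```

The second call is the problematic case: nothing bounds the jump except the earliest deadline, however far away it is.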

Yes I didn't think about that :).

3) do not process vm_clock timers at all unless there is no pending I/O
(block/network); if there is none, track rt_clock as in current
behavior.  I just made it up, but it sounds promising and similar to
synchronous I/O.  It should not be extremely hard to implement, and it
can remove this kind of nondeterminism.  But it won't fix the case when
the CPU is polling.
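
The decision rule in this proposal can be written down as a small sketch. The function and the returned labels are hypothetical, chosen only to make the rule explicit; nothing here corresponds to existing QEMU code.

```python
def idle_action(pending_io_requests):
    """Toy model: what to do when the VCPU halts waiting for an interrupt.

    While block/network I/O is in flight, vm_clock timers are not processed
    and the VCPU simply waits for the I/O, much like synchronous I/O.
    With no I/O pending, fall back to tracking rt_clock as today.
    """
    if pending_io_requests > 0:
        return "wait_for_io"
    return "track_rt_clock"


print(idle_action(2))
print(idle_action(0))
```

Note that the rule only applies when the VCPU is halted, which is why a guest that polls for completion is not covered by it.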

Thanks, I need to take a look at all this.

Paolo

ps: I'm not an expert on icount at all, I'm only reasoning about the
possible interactions with the main loop.

Both icount and reverse execution need an instruction counter. icount uses a
count-down mechanism but reverse execution needs a continuous counter. For now
we have built a separate counter and we think that these two counters can be
merged. However we would like feedback about this before modifying this.

I definitely think that there should only be one counter, not two.
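
One way to picture a single merged counter is to keep only the continuous total and derive the count-down from it. The class below is a hypothetical sketch under that assumption; it is not how QEMU's icount is implemented, and all names are invented for illustration.

```python
class InstructionCounter:
    """Toy model: one continuous counter that also serves as a count-down."""

    def __init__(self):
        self.total = 0       # continuous count, never reset
        self.slice_end = 0   # value of total at which the current slice expires

    def start_slice(self, budget):
        """Allow up to `budget` more instructions before the next stop."""
        self.slice_end = self.total + budget

    @property
    def remaining(self):
        """Count-down view, derived from the continuous counter."""
        return self.slice_end - self.total

    def execute(self, n):
        """Run up to n instructions, stopping when the slice is exhausted."""
        done = min(n, self.remaining)
        self.total += done
        return done


c = InstructionCounter()
c.start_slice(100)
c.execute(30)
print(c.total, c.remaining)
c.execute(100)          # only 70 instructions are left in this slice
print(c.total, c.remaining)
```

Here the count-down is just a view on the continuous total, so the two values cannot drift apart.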

thanks
-- PMM




