Re: [Qemu-devel] [PATCH 4/7 v2] KVM regsync: Add register bitmap paramet

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 4/7 v2] KVM regsync: Add register bitmap paramet

From:	Marcelo Tosatti
Subject:	Re: [Qemu-devel] [PATCH 4/7 v2] KVM regsync: Add register bitmap parameter to do_kvm_cpu_synchronize_state
Date:	Wed, 16 Jan 2013 21:01:58 -0200
User-agent:	Mutt/1.5.21 (2010-09-15)

On Wed, Jan 16, 2013 at 09:41:54PM +0100, Christian Borntraeger wrote:
> On 16/01/13 21:21, Marcelo Tosatti wrote:
> > On Wed, Jan 16, 2013 at 09:03:20PM +0100, Christian Borntraeger wrote:
> >> On 16/01/13 17:05, Marcelo Tosatti wrote:
> >>
> >>> The S/390 problem, from
> >>> http://lists.nongnu.org/archive/html/qemu-devel/2012-11/msg02213.html:
> >>>
> >>> ">>> The kvm register sync needs to happen in the kvm register sync
> >>>>>> function :)
> >>>>> That would eliminate the whole purpose of sync regs and forces us to
> >>>>> have an
> >>>>> expensive ioctl on lots of exits (again). I would prefer to sync the 
> >>>>> registers
> >>>>> that we never need in qemu just here.
> >>>>
> >>>> That's why the register sync has different stages.
> >>>
> >>> Not the get_register. Which is called on every synchronize_state. Which
> >>> happen 
> >>> quite often
> >>> on s390."
> >>>
> >>> But wait: on these S/390 codepaths, you do GET_REGS already, via
> >>> cpu_synchronize_state.
> >>>
> >>> So on S/390
> >>>
> >>> - cpu_synchronize_state(env)
> >>> - read any register from env
> >>>
> >>> Is not valid? This is what generic code assumes.
> >>
> >> TO recap the motiviation:
> >>
> >> cpu_synchronize_state on s390 currently updates any register in env that is
> >> used by qemu (general purpose, prefix, psw, control and access) in the 
> >> normal
> >> runtime. it turns out we have all of these regs in kvm_run, so we can do 
> >> synchronize states without doing an additional ioctl call.
> >> Now, for life migration and dump we need some additional registers (which 
> >> are
> >> only accessable via onereg interface). So synchronize_state would need to
> >> do 3 or 4 additional system calls on the hot path, only to take care of 
> >> something that is not on the hot path at all.
> >> For historic reasons, we have one exit code for almost all exits. 
> >> Therefore,
> >> we need to call synchronize_states almost always.
> >> We could now start to have a poor mans synchronize_state in arch code, but
> >> that would collide with common code synchronize_state if done at the wrong
> >> time. Thus we want to make common code capable of having only a subset of
> >> the register synched - by making it possible to sync the other regs later
> >> on if needed without wiping the former sync.
> >>
> >> Makes sense?
> >>
> >> Christian
> > 
> > Yes. As noted in the last email on the thread, runtime/reset/full are to
> > serapate sets of registers when writing _to_ kernel. When reading _from_
> > kernel, reset and full distinctions are not appropriate (any register
> > can change, as far as knowledge goes).
> 
> Hmm, I probably did not understood your point, so I will try to explain mine
> and see what you respond :-)
> 
> The point of the patch set, is to allow this distinction when reading. 
> In other words it allows code to state: I am only interested in regxy and dont
> care if the other regs in env are out of sync.

Fine.

> If a full sync is necessary later on the other regs are synched as well.
> If a full sync was already done before a partial get becomes a no-op.

-> FULL is the set of registers written when loadvm/initialization is
performed.
-> RESET, a subset of full, is a set of registers written on SYSTEM
RESET.
-> RUNTIME, a subset of RESET, is a set of registers written during
RUNTIME.

To write both the RESET and FULL set of registers during runtime,
contradicts the description above for both RESET and FULL.

Two examples from i386:

    if (level == KVM_PUT_FULL_STATE) {
        /*
         * KVM is yet unable to synchronize TSC values of multiple VCPUs
         * on
         * writeback. Until this is fixed, we only write the offset to
         * SMP
         * guests after migration, desynchronizing the VCPUs, but
         * avoiding
         * huge jump-backs that would occur without any writeback at
         * all.
         */
        ...
    }

And:

    /*
     * The following paravirtual MSRs have side effects on the guest or
     * are
     * too heavy for normal writeback. Limit them to reset or full state
     * updates.
     */

> Why should that be not possible.

It should, but separately from FULL/RESET/RUNTIME distinction.
This sequence

        get_regs(FULLSTATE)
        put_regs(FULLSTATE)

During runtime is not allowed. And only syncing the RUNTIME set of
registers during and leaving the FULL set of registers marked as
dirty is confusing also. 

So perhaps what you'd want is selective read/write of RUNTIME registers
as suggested.


Date: Fri, 4 Jan 2013 23:49:42 -0200
From: Marcelo Tosatti <address@hidden>
To: "Jason J. Herne" <address@hidden>
Cc: Alexander Graf <address@hidden>,
        Bhushan Bharat-R65777 <address@hidden>,
        Christian Borntraeger <address@hidden>,
        Anthony Liguori <address@hidden>,
        "address@hidden qemu-devel" <address@hidden>
Subject: Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix
 do_kvm_cpu_synchronize_state data integrity issue

On Fri, Jan 04, 2013 at 10:25:45AM -0500, Jason J. Herne wrote:
> If I've followed the conversation correctly this is what needs to be done:
> 
> 1. Remove the level parameters from kvm_arch_get_registers and
> kvm_arch_put_registers.
> 
> 2. Add a new bitmap parameter to kvm_arch_get_registers and
> kvm_arch_put_registers.
> 
> 3. Define a bit that correlates to our current notion of "all
> runtime registers".  This bit, and all bits in this bitmap, would be
> architecture specific.
> 
> 4. Remove the cpustate->kvm_sync_dirty field.  Replace it with a
> bitmap that tracks which bits are dirty and need to be synced back
> to KVM-land.
> 
> 5. As we do today, we'll assume registers are dirty and turn on
> their corresponding bit in this new bitmap whenever we "get" the
> registers from KVM.
> 
> 6. Add other bits as needed on a case by case basis.
> 
> Does this seem to match what was discussed, and what we want to do?
> 
> 
> -- 
> -- Jason J. Herne (address@hidden)

The s390 change to use runtime_state is self-contained and could be
integrated now from my pov.

The suggestion was based on the observation that the ppc guys are trying 
to fix their timer problem and that should be a generic solution,
per-register granularity.

Perhaps something along the lines of 

init
all registers marked as cached (read/write accessors don't 
synchronize state)

Around ioctl(KVM_RUN) have:

if regs dirty
        writeback
r = ioctl(KVM_RUN)
mark runstate registers as not cached

Then

read_reg(x)
        if x not cached
                arch_get_regs(RUNTIME_STATE) (*)

write_reg(x, val)
        read_reg(x)
        cpustate->x = val;
        mark_dirty(x)

Which is basically the pattern used in KVM x86 (but instead of
ioctl(KVM_RUN) there is VMENTRY).

But talk is cheap, how that would work with RESET/FULL states
is unclear. Jan?

For those archs that need, of course, (*) would use ONE_REG or whatever
other interface. For x86, or those archs that are not converted, keep
using kvm_arch_get_regs/kvm_arch_put_regs.

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [Qemu-devel] [PATCH 4/7 v2] KVM regsync: Add register bitmap parameter to do_kvm_cpu_synchronize_state, (continued)

Prev by Date: Re: [Qemu-devel] [RFC V1 10/12] qerror: Add QERR_DEVICE_NOT_DEDUPLICATED.
Next by Date: [Qemu-devel] buildbot failure in qemu on default_x86_64_rhel61
Previous by thread: Re: [Qemu-devel] [PATCH 4/7 v2] KVM regsync: Add register bitmap parameter to do_kvm_cpu_synchronize_state
Next by thread: Re: [Qemu-devel] [PATCH 4/7 v2] KVM regsync: Add register bitmap parameter to do_kvm_cpu_synchronize_state
Index(es):
- Date
- Thread