From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] Reply: Re: [PATCH] vhost: fix a migration failed because of vhost region merge
Date: Sat, 22 Jul 2017 02:50:39 +0300
On Thu, Jul 20, 2017 at 10:57:57AM +0800, address@hidden wrote:
> Original message
> From: <address@hidden>;
> To: <address@hidden>;
> Cc: <address@hidden>; <address@hidden>;
> <address@hidden>; 彭浩10096742; 王业超10154425;
> <address@hidden>;
> Date: 2017-07-19 23:53
> Subject: Re: [Qemu-devel] [PATCH] vhost: fix a migration failed because of vhost
> region merge
>
>
> On Wed, Jul 19, 2017 at 03:24:27PM +0200, Igor Mammedov wrote:
> > On Wed, 19 Jul 2017 12:46:13 +0100
> > "Dr. David Alan Gilbert" <address@hidden> wrote:
> >
> > > * Igor Mammedov (address@hidden) wrote:
> > > > On Wed, 19 Jul 2017 23:17:32 +0800
> > > > Peng Hao <address@hidden> wrote:
> > > >
> > > > > > When a guest that has several hotplugged DIMMs is migrated, it
> > > > > > fails to resume on the destination host. On the source host the
> > > > > > vhost regions of the DIMMs are merged, but in the restore stage
> > > > > > the destination host checks the vhost slot limit before the
> > > > > > vhost regions of those DIMMs have been merged.
> > > > could you provide a bit more detailed description of the problem
> > > > including command line+used device_add commands on source and
> > > > command line on destination?
> > >
> > > (ccing in Marc Andre and Maxime)
> > >
> > > > Hmm, I'd like to understand the situation where you get merging between
> > > > RAMBlocks; that complicates some stuff for postcopy.
> > and probably inconsistent merging breaks vhost as well
> >
> > merging might happen if regions are adjacent or overlap
> > but for that to happen merged regions must have equal
> > distance between their GPA:HVA pairs, so that following
> > translation would work:
> >
> >     if gpa in regionX[gpa_start, len, hva_start]
> >         hva = hva_start + gpa - gpa_start
> >
> > while the GPA of regions is under QEMU control and deterministic,
> > the HVA is not, so in the migration case merging might happen on the source
> > side but not on the destination, resulting in different memory maps.
> >
> > Maybe Michael knows the details of why migration works in the vhost use case,
> > but I don't see vhost sending any vmstate data.
>
> We aren't merging ramblocks at all.
> When we are passing blocks A and B to vhost, if we see that
>
> hvaB=hvaA + lenA
> gpaB=gpaA + lenA
>
> then we can improve performance a bit by passing a single
> chunk to vhost: hvaA,gpaA,lenA+lenB
>
>
> so it does not affect migration normally.
>
> ----- I think it is like this:
>
> on source:              on destination (restore):
>
> realize device 1        realize device 1
> realize device 2        realize dimm0
> ...                     realize dimm1
>                         ...
> realize device n        realize dimmx
> realize device m
> realize dimm0           ...
> realize dimm1           ...
> ...                     ...
> realize dimmx           realize device n
>
>
> In the restore stage, the order in which devices are realized differs from
> the order used when the VM was started, because of the added DIMMs.
>
> So at some stage during restore the vhost regions may not be mergeable yet.
If you run over the number of regions supported by vhost on destination
then you won't be able to start a VM there until you disable vhost.
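
To make the slot arithmetic concrete, here is a minimal sketch in C of the merge rule quoted above (hvaB = hvaA + lenA and gpaB = gpaA + lenA). It is not QEMU's actual code: struct region, can_merge() and merge_regions() are names made up for illustration, and the sketch assumes the regions are already sorted by GPA.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one vhost memory region (not QEMU's type). */
struct region {
    uint64_t gpa;   /* guest physical start address */
    uint64_t hva;   /* host virtual start address */
    uint64_t len;   /* length in bytes */
};

/* B can be folded into A only if it directly follows A in BOTH
 * address spaces, so one GPA->HVA offset is valid for the whole chunk. */
static int can_merge(const struct region *a, const struct region *b)
{
    return b->gpa == a->gpa + a->len && b->hva == a->hva + a->len;
}

/* Merge neighbouring regions in place (input sorted by gpa).
 * Returns the number of regions, i.e. slots, left after merging. */
static size_t merge_regions(struct region *r, size_t n)
{
    size_t out = 0;

    if (n == 0) {
        return 0;
    }
    for (size_t i = 1; i < n; i++) {
        if (can_merge(&r[out], &r[i])) {
            r[out].len += r[i].len;   /* extend the current chunk */
        } else {
            r[++out] = r[i];          /* start a new chunk */
        }
    }
    return out + 1;
}
```

With two DIMMs that are contiguous in both GPA and HVA, merge_regions() returns 1. If the second DIMM ends up at an unrelated HVA, or if the limit is checked before merging happens at all, the same guest memory counts as 2 regions, which is how the count can differ between source and destination.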
>
> > > > > Signed-off-by: Peng Hao <address@hidden>
> > > > > Signed-off-by: Wang Yechao <address@hidden>
> > > > > ---
> > > > > hw/mem/pc-dimm.c | 2 +-
> > > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
> > > > > index ea67b46..bb0fa08 100644
> > > > > --- a/hw/mem/pc-dimm.c
> > > > > +++ b/hw/mem/pc-dimm.c
> > > > > @@ -101,7 +101,7 @@ void pc_dimm_memory_plug
> (DeviceState *dev, MemoryHotplugState *hpms,
> > > > > goto out;
> > > > > }
> > > > >
> > > > > - if (!vhost_has_free_slot()) {
> > > > > + if (!vhost_has_free_slot() && runstate_is_running()) {
> > > > > error_setg(&local_err, "a used vhost backend has no free"
> > > > > " memory slots left");
> > > > > goto out;
> > >
> > > > Even this produces the wrong error message in this case;
> > > > it also makes me wonder whether the existing code should undo a lot of
> > > > the object_property_set's that happen.
> > >
> > > Dave
> > > >
> > > >
> > > --
> > > Dr. David Alan Gilbert / address@hidden / Manchester, UK
>
>
>