Re: [Qemu-devel] [PATCH 2/2] pc: memhp: force gaps between DIMM's GPA


From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] [PATCH 2/2] pc: memhp: force gaps between DIMM's GPA
Date: Sun, 27 Sep 2015 17:18:48 +0300

On Sun, Sep 27, 2015 at 04:04:06PM +0200, Igor Mammedov wrote:
> On Sun, 27 Sep 2015 16:11:02 +0300
> "Michael S. Tsirkin" <address@hidden> wrote:
> 
> > On Sun, Sep 27, 2015 at 03:06:24PM +0200, Igor Mammedov wrote:
> > > On Sun, 27 Sep 2015 13:48:21 +0300
> > > "Michael S. Tsirkin" <address@hidden> wrote:
> > > 
> > > > On Fri, Sep 25, 2015 at 03:53:12PM +0200, Igor Mammedov wrote:
> > > > > mapping DIMMs non-contiguously allows working around the
> > > > > virtio bug reported earlier:
> > > > > http://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg00522.html
> > > > > in this case guest kernel doesn't allocate buffers
> > > > > that can cross DIMM boundary keeping each buffer
> > > > > local to a DIMM.
> > > > > 
> > > > > Suggested-by: Michael S. Tsirkin <address@hidden>
> > > > > Signed-off-by: Igor Mammedov <address@hidden>
> > > > > ---
> > > > > benefit of this workaround is that no guest side
> > > > > changes are required.
> > > > 
> > > > That's a hard requirement, I agree.
> > > > 
> > > > 
> > > > > ---
> > > > >  hw/i386/pc.c         | 4 +++-
> > > > >  hw/i386/pc_piix.c    | 3 +++
> > > > >  hw/i386/pc_q35.c     | 3 +++
> > > > >  include/hw/i386/pc.h | 2 ++
> > > > >  4 files changed, 11 insertions(+), 1 deletion(-)
> > > > 
> > > > Aren't other architectures besides PC ever affected?
> > > > Do they all allocate all of memory contiguously in HVA space?
> > > I'm not sure about other targets; I've CCed interested parties.
> > > 
> > > > 
> > > > Also - does the issue only affect hotplugged memory?
> > > Potentially it affects -numa memdev=foo, but however hard I
> > > tried, I wasn't able to reproduce it.
> > > We could do that as a
> > > separate workaround later if it affects someone
> > > and virtio is not fixed to handle split buffers by that time.
> > > 
> > 
> > You can't reproduce a crash or you can't reproduce getting contiguous
> > GPA with fragmented HVA?
> > If you can see fragmentation that's enough to assume guest crash can
> > be triggered, even if it doesn't with Linux.
> I'll check it.
> 
> > 
> > >  
> > > > Can't the patch be local to pc-dimm (except maybe the
> > > > backwards compatibility thing)?
> > > I think the decision about using gaps and their size
> > > should be made by the board and not by generic pc-dimm.
> > > 
> > 
> > Well virtio is generic and can be used by all boards.
> Still, pc-dimm.addr allocation is not part of the pc-dimm
> device. It's just helper functions that happen to live in
> the same source file.
> 
> But more importantly, every target might have its own
> notion of how it partitions the hotplug address space, so making
> the same gap global might break them.

That's why each target has its own alignment, no?

> It's safer to enable gaps per target. I think the ppc guys
> will make their own patch on top of this, taking
> into account their target-specific and compat stuff.

Sure, it can be split up but we need to address at least
the kvm platforms.

> > 
> > 
> > > > > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > > > > index 91d134c..c462c4e 100644
> > > > > --- a/hw/i386/pc.c
> > > > > +++ b/hw/i386/pc.c
> > > > > @@ -1629,6 +1629,7 @@ static void pc_dimm_plug(HotplugHandler 
> > > > > *hotplug_dev,
> > > > >      HotplugHandlerClass *hhc;
> > > > >      Error *local_err = NULL;
> > > > >      PCMachineState *pcms = PC_MACHINE(hotplug_dev);
> > > > > +    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> > > > >      PCDIMMDevice *dimm = PC_DIMM(dev);
> > > > >      PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
> > > > >      MemoryRegion *mr = ddc->get_memory_region(dimm);
> > > > > @@ -1644,7 +1645,8 @@ static void pc_dimm_plug(HotplugHandler 
> > > > > *hotplug_dev,
> > > > >          goto out;
> > > > >      }
> > > > >  
> > > > > > -    pc_dimm_memory_plug(dev, &pcms->hotplug_memory, mr, align, 0, &local_err);
> > > > > +    pc_dimm_memory_plug(dev, &pcms->hotplug_memory, mr, align,
> > > > > +                        pcmc->inter_dimm_gap, &local_err);
> > > > >      if (local_err) {
> > > > >          goto out;
> > > > >      }
> > > > > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> > > > > index 3ffb05f..3165667 100644
> > > > > --- a/hw/i386/pc_piix.c
> > > > > +++ b/hw/i386/pc_piix.c
> > > > > @@ -457,11 +457,13 @@ static void pc_xen_hvm_init(MachineState 
> > > > > *machine)
> > > > >  
> > > > >  static void pc_i440fx_machine_options(MachineClass *m)
> > > > >  {
> > > > > +    PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
> > > > >      m->family = "pc_piix";
> > > > >      m->desc = "Standard PC (i440FX + PIIX, 1996)";
> > > > >      m->hot_add_cpu = pc_hot_add_cpu;
> > > > >      m->default_machine_opts = "firmware=bios-256k.bin";
> > > > >      m->default_display = "std";
> > > > > +    pcmc->inter_dimm_gap = PC_2MB_DIMM_GAP;
> > > > >  }
> > > > >  
> > > > >  static void pc_i440fx_2_5_machine_options(MachineClass *m)
> > > > > @@ -482,6 +484,7 @@ static void 
> > > > > pc_i440fx_2_4_machine_options(MachineClass *m)
> > > > >      m->alias = NULL;
> > > > >      m->is_default = 0;
> > > > >      pcmc->broken_reserved_end = true;
> > > > > +    pcmc->inter_dimm_gap = 0;
> > > > >      SET_MACHINE_COMPAT(m, PC_COMPAT_2_4);
> > > > >  }
> > > > >  
> > > > > diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> > > > > index 1b7d3b6..8ad6687 100644
> > > > > --- a/hw/i386/pc_q35.c
> > > > > +++ b/hw/i386/pc_q35.c
> > > > > @@ -360,6 +360,7 @@ static void pc_compat_1_4(MachineState *machine)
> > > > >  
> > > > >  static void pc_q35_machine_options(MachineClass *m)
> > > > >  {
> > > > > +    PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
> > > > >      m->family = "pc_q35";
> > > > >      m->desc = "Standard PC (Q35 + ICH9, 2009)";
> > > > >      m->hot_add_cpu = pc_hot_add_cpu;
> > > > > @@ -368,6 +369,7 @@ static void pc_q35_machine_options(MachineClass 
> > > > > *m)
> > > > >      m->default_display = "std";
> > > > >      m->no_floppy = 1;
> > > > >      m->no_tco = 0;
> > > > > +    pcmc->inter_dimm_gap = PC_2MB_DIMM_GAP;
> > > > >  }
> > > > >  
> > > > >  static void pc_q35_2_5_machine_options(MachineClass *m)
> > > > > @@ -385,6 +387,7 @@ static void 
> > > > > pc_q35_2_4_machine_options(MachineClass *m)
> > > > >      pc_q35_2_5_machine_options(m);
> > > > >      m->alias = NULL;
> > > > >      pcmc->broken_reserved_end = true;
> > > > > +    pcmc->inter_dimm_gap = 0;
> > > > >      SET_MACHINE_COMPAT(m, PC_COMPAT_2_4);
> > > > >  }
> > > > >  
> > > > > diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> > > > > index 6896328..dd6b34a 100644
> > > > > --- a/include/hw/i386/pc.h
> > > > > +++ b/include/hw/i386/pc.h
> > > > > @@ -50,6 +50,7 @@ struct PCMachineState {
> > > > >  #define PC_MACHINE_SMM              "smm"
> > > > >  #define PC_MACHINE_ENFORCE_ALIGNED_DIMM "enforce-aligned-dimm"
> > > > >  
> > > > > +#define PC_2MB_DIMM_GAP (1ULL << 21)
> > > > >  /**
> > > > >   * PCMachineClass:
> > > > > >   * @get_hotplug_handler: pointer to parent class callback @get_hotplug_handler
> > > > 
> > > > > Seems somewhat arbitrary. It's aligned later - so won't a 1 byte gap
> > > > > be enough?
> > > > 1 byte should also be enough, since effectively it would trigger the
> > > > alignment adjustment.
> > > 
> > > > The reason why I've picked 2MB is that the QEMU RAM allocator allocates
> > > > with 2MB granularity.
> > 
> > It will align it itself. We don't want that logic to spread out to
> > unrelated parts of QEMU.
> Ok, I'll make it boolean and just use hardcoded 1
> in pc_dimm_plug()->pc_dimm_memory_plug(...,1,...)
> 
> > 
> > > > 
> > > > > @@ -60,6 +61,7 @@ struct PCMachineClass {
> > > > >  
> > > > >      /*< public >*/
> > > > >      bool broken_reserved_end;
> > > > > +    uint64_t inter_dimm_gap;
> > > > >      HotplugHandler *(*get_hotplug_handler)(MachineState *machine,
> > > > >                                             DeviceState *dev);
> > > > >  };
> > > > > -- 
> > > > > 1.8.3.1


