Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

From:	Nick Piggin
Subject:	Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device
Date:	Thu, 11 Mar 2010 15:37:08 +1100
User-agent:	Mutt/1.5.20 (2009-06-14)

On Thu, Mar 11, 2010 at 03:10:47AM +0000, Jamie Lokier wrote:
> Paul Brook wrote:
> > > > In a cross environment that becomes extremely hairy.  For example the 
> > > > x86
> > > > architecture effectively has an implicit write barrier before every
> > > > store, and an implicit read barrier before every load.
> > > 
> > > Btw, x86 doesn't have any implicit barriers due to ordinary loads.
> > > Only stores and atomics have implicit barriers, afaik.
> > 
> > As of March 2009[1] Intel guarantees that memory reads occur in
> > order (they may only be reordered relative to writes). It appears
> > AMD do not provide this guarantee, which could be an interesting
> > problem for heterogeneous migration..
> 
> (Summary: At least on AMD64, it does too, for normal accesses to
> naturally aligned addresses in write-back cacheable memory.)
> 
> Oh, that's interesting.  Way back when I guess we knew writes were in
> order and it wasn't explicit that reads were, hence smp_rmb() using a
> locked atomic.
> 
> Here is a post by Nick Piggin from 2007 with links to Intel _and_ AMD
> documents asserting that reads to cacheable memory are in program order:
> 
>     http://lkml.org/lkml/2007/9/28/212
>     Subject: [patch] x86: improved memory barrier implementation
> 
> Links to documents:
> 
>     http://developer.intel.com/products/processor/manuals/318147.pdf
>     
> http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24593.pdf
> 
> The Intel link doesn't work any more, but the AMD one does.

It might have been merged into their development manual now.

 
> Nick asserts "both manufacturers are committed to in-order loads from
> cacheable memory for the x86 architecture".

At the time we did ask Intel and AMD engineers. We talked with Andy
Glew from Intel I believe, but I can't recall the AMD contact.
Linus was involved in the discussions as well. We tried to do the
right thing with this.

> I have just read the AMD document, and it is in there (but not
> completely obviously), in section 7.2.  The implicit load-load and
> store-store barriers are only guaranteed for "normal cacheable
> accesses on naturally aligned boundaries to WB [write-back cacheable]
> memory".  There are also implicit load-store barriers but not
> store-load.
> 
> Note that the document covers AMD64; it does not say anything about
> their (now old) 32-bit processors.

Hmm. Well it couldn't hurt to ask again. We've never seen any
problems yet, so I'm rather sure we're in the clear.

> 
> > [*] The most recent docs I have handy. Up to and including Core-2 Duo.
> 
> Are you sure the read ordering applies to 32-bit Intel and AMD CPUs too?
> 
> Many years ago, before 64-bit x86 existed, I recall discussions on
> LKML where it was made clear that stores were performed in program
> order.  If it were known at the time that loads were performed in
> program order on 32-bit x86s, I would have expected that to have been
> mentioned by someone.

The way it was explained to us by the Intel engineer is that they
had implemented only visibly in-order loads, but they wanted to keep
their options open in future so they did not want to commit to in
order loads as an ISA feature.

So when the whitepaper was released we got their blessing to
retroactively apply the rules to previous CPUs.

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device, (continued)

Prev by Date: Re: [Qemu-devel] Regression: more 0.12 regression (SeaBIOS related?)
Next by Date: Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device
Previous by thread: Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device
Next by thread: Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device
Index(es):
- Date
- Thread