
Re: [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses


From: Maciej W. Rozycki
Subject: Re: [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA
Date: Wed, 13 May 2015 23:54:47 +0100 (BST)
User-agent: Alpine 2.11 (LFD 23 2013-08-11)

On Wed, 13 May 2015, Richard Henderson wrote:

> >> I believe the problem is that MSA vector register's size is 16-bytes
> >> (this DATA_SIZE isn't supported in softmmu_template) and MSA load/store
> >> is supposed to be atomic.
> > 
> >  Not really AFAICT.  Here's what the specification says[1]:
> > 
> > "The vector load instruction is atomic at the element level with no 
> > guaranteed ordering among elements, i.e. each element load is an atomic 
> > operation issued in no particular order with respect to the element's 
> > vector position."
> > 
> > and[2]:
> > 
> > "The vector store instruction is atomic at the element level with no 
> > guaranteed ordering among elements, i.e. each element store is an atomic 
> > operation issued in no particular order with respect to the element's 
> > vector position."
> > 
> > so you only need to get atomic up to 8 bytes (with LD.D and ST.D, less 
> > with the narrower vector elements), and that looks supported to me.
> 
> There's "atomic" in the transactional sense, and then there's "atomic" in the
> visibility to other actors on the bus sense.
> 
> Presumably Leon is talking about the first, wherein we must ensure all writes
> to both pages must succeed.  Which just means making sure that both pages are
> present and writable before modifying any memory.

 I don't think we have to.  The specification is a bit unclear, I must 
admit, and it also defines the details of vector load and store 
operations as implementation dependent, so there's no further 
clarification.

 However, any unaligned loads or stores that cross a data-bus-width 
boundary require two bus cycles to complete and therefore by definition 
are not atomic in the "visibility to other actors on the bus" sense.  
Therefore the only sense of atomicity that can be considered here is, I 
believe, transactional, on a per-element basis, as this is what the 
specification refers to.
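
 As a rough illustration of the boundary-crossing condition, here is a 
minimal sketch in C.  The helper name crosses_bus_boundary() and its 
parameters are made up for this example (they are not from QEMU or the 
specification), and it assumes the bus width is a power of two:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: returns true if an access
 * of `size` bytes at `addr` spans two bus-width-aligned units, and so
 * would need two bus cycles.  Assumes `bus_bytes` is a power of two and
 * that the access is no wider than the bus.
 */
static bool crosses_bus_boundary(uint64_t addr, unsigned size,
                                 unsigned bus_bytes)
{
    assert(bus_bytes != 0 && (bus_bytes & (bus_bytes - 1)) == 0);
    assert(size != 0 && size <= bus_bytes);

    /* Offset within the bus-width unit, plus the access length. */
    return (addr & (bus_bytes - 1)) + size > bus_bytes;
}
```

 With an 8-byte bus, a doubleword access at 0x1004 crosses a boundary, 
while the same access at 0x1000 does not.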

 Then the exact semantics of loads and stores are left up to the 
implementer, so for example ST.H can be implemented as 2 
doubleword-store transactions, or 4 word-store transactions (that 
wouldn't be allowed with ST.D), or 8 halfword-store transactions (that 
wouldn't be allowed with ST.W), but not 16 byte-store transactions (that 
would be allowed with ST.B).
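
 The rule I am reading into the specification here can be sketched as 
follows.  This is only my interpretation put into C; split_allowed() 
and the constants are names invented for the example, and the 8-byte 
upper limit reflects the widest element (LD.D/ST.D):

```c
#include <assert.h>
#include <stdbool.h>

enum {
    MSA_VECTOR_BYTES = 16,      /* size of an MSA vector register */
    MAX_TRANSACTION_BYTES = 8   /* assumed widest transaction (doubleword) */
};

/*
 * Hypothetical check: may a vector store with elements of `elem_bytes`
 * be split into transactions of `txn_bytes` each?  Under the reading
 * above, a transaction must cover at least one whole element, so no
 * element is ever split across transactions.
 */
static bool split_allowed(unsigned elem_bytes, unsigned txn_bytes)
{
    assert(elem_bytes == 1 || elem_bytes == 2 ||
           elem_bytes == 4 || elem_bytes == 8);

    bool pow2 = txn_bytes != 0 && (txn_bytes & (txn_bytes - 1)) == 0;
    return pow2 && txn_bytes >= elem_bytes &&
           txn_bytes <= MAX_TRANSACTION_BYTES;
}

/* Number of transactions needed for a whole vector at a given width. */
static unsigned transaction_count(unsigned txn_bytes)
{
    return MSA_VECTOR_BYTES / txn_bytes;
}
```

 For ST.H (elem_bytes == 2) this accepts widths of 8, 4 and 2 bytes, 
i.e. 2, 4 or 8 transactions, and rejects a width of 1 byte.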

 Consequently I believe only individual vector element writes (or reads, 
for that matter) are required to either successfully complete or 
completely back out, and a TLB, an address error or a bus error 
exception (or perhaps a hardware interrupt exception even) happening in 
the middle of a vector load or store instruction may observe the 
destination vector register or memory respectively partially updated 
with elements already transferred (but not an individual element 
partially transferred).
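
 To make the intended behaviour concrete, here is a small model of a 
vector store that is interrupted part-way.  It is a sketch under my 
reading only, not QEMU code: model_vector_store() and `fault_elem` are 
hypothetical, and the elements are written in ascending order purely 
for simplicity (the specification guarantees no ordering):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { VEC_BYTES = 16 };  /* size of an MSA vector register */

/*
 * Hypothetical model of an element-wise vector store.  Each element is
 * written as one indivisible unit.  `fault_elem` is the index of the
 * element whose store raises an exception; pass a value >= the element
 * count for a store that completes.  Returns the number of elements
 * that reached memory before the exception.
 */
static size_t model_vector_store(uint8_t *mem, const uint8_t *vec,
                                 size_t elem_bytes, size_t fault_elem)
{
    assert(elem_bytes == 1 || elem_bytes == 2 ||
           elem_bytes == 4 || elem_bytes == 8);

    size_t count = VEC_BYTES / elem_bytes;
    size_t i;

    for (i = 0; i < count; i++) {
        if (i == fault_elem) {
            break;  /* exception: this element is not written at all */
        }
        memcpy(mem + i * elem_bytes, vec + i * elem_bytes, elem_bytes);
    }
    return i;
}
```

 With word elements and a fault on element 2, the first 8 bytes of 
memory hold new data and the remaining 8 are untouched, so memory is 
partially updated but no single element is torn.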

 That would be consistent with what happens with the other multi-word 
transfer instructions I mentioned when they get interrupted on the way 
(yes, they do allow hardware interrupts to break them too) and likely 
easier to implement as well.

 That's just my interpretation though.  Perhaps the specification needs 
further clarification.

  Maciej


