Re: [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics

From:	Aurelien Jarno
Subject:	Re: [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics
Date:	Sun, 16 Jul 2017 17:18:02 +0200
User-agent:	NeoMutt/20170113 (1.7.2)

On 2017-07-15 16:30, Richard Henderson wrote:
> On 07/15/2017 12:14 PM, Aurelien Jarno wrote:
> > On 2017-07-06 16:20, Richard Henderson wrote:
> > > For uniprocessors, SH4 uses optimistic restartable atomic sequences.
> > > Upon an interrupt, a real kernel would simply notice magic values in
> > > the registers and reset the PC to the start of the sequence.
> > > 
> > > For QEMU, we cannot do this in quite the same way.  Instead, we notice
> > > the normal start of such a sequence (mov #-x,r15), and start a new TB
> > > that can be executed under cpu_exec_step_atomic.
> > 
> > Do you have actually have a good documentation about gUSA? I have found
> > a few documents (some of them in Japanese), the most complete one being
> > the LinuxTag paper. The ABI is also described in the kernel and the
> > glibc. That said I am missing the following informations:
> 
> Kernel sources and glibc are good.  The description in GCC is pretty good as 
> well:
> 
> https://gcc.gnu.org/viewcvs/gcc/trunk/gcc/config/sh/sync.md?revision=243994&view=markup#l53

Thanks for the pointer.

> > - What kind of instructions are allowed in the atomic sequence? Your
> >    patch takes into account branches, but are there allowed? used in
> >    practice? What about FP instructions?
> 
> Any sequence that can be restarted.

Ok. So I guess it means a sequence with branch taken or not taken is
possible. The same way I guess that another instruction than a mov to
set a negative stack pointer and start the sequence is also possible,
also not emitted by glibc or gcc.

Therefore we should consider this patchset to cover 99% of the
most common case. Of course the 1% remaining cases are not handled as
atomic, but besides that kept being correctly emulated.

> > - Does the atomic sequence is actually allowed to cross pages?
> 
> I don't see why not.  It would be restarted on a page fault, but assuming
> that a process can have 2 pages in ram doesn't seem onerous.

Ok. Same comment as above.

> > - Is there any alignement required? The paper mention adding a nop to
> >    gUSA_exchange_and_add to align the end point to 4 bytes.
> 
> That's about the mov.l instruction that computes the address.

I guess you mean mova. That make senses, and actually doesn't impact
QEMU.
 
> > This looks fine. I guess this can be improved in a another patch by
> > changing the caller to pass a single target address and if the condition
> > is true or false. This would avoid having to detect that here, and would
> > not make the code below more complicated.
> 
> Sure.
> 
> > > +    if (ctx->tbflags & GUSA_EXCLUSIVE) {
> > > +        /* Regardless of single-stepping or the end of the page,
> > > +           we must complete execution of the gUSA region while
> > > +           holding the exclusive lock.  */
> > > +        *pmax_insns = max_insns;
> > > +        return 0;
> > >       }
> > 
> > What are the consequence of not stopping the translation when crossing
> > the page in user mode? If it doesn't have any, we should probably change
> > the code to never stop when crossing pages.
> 
> The only consequence is recognizing a segv "too early".  It's a quirk that,
> frankly, I'm willing to accept.

Ok.

> > > +    if (ctx.tbflags & GUSA_EXCLUSIVE) {
> > > +        /* Ending the region of exclusivity.  Clear the bits.  */
> > > +        ctx.envflags &= ~GUSA_MASK;
> > > +    }
> > > +
> > 
> > IIUC this assumes the number of instructions in the sequence is always
> > executed. I guess this is not correct if the TCG op buffer is full. Some
> > non-privileged instructions might also stop the translation, but they
> > are all FPU instructions, so I guess the section is simply not valid in
> > that case.
> 
> The TCG op buffer is always empty when we begin the sequence.  Since we're
> limited to 128 bytes, or 64 instructions, I'm not concerned about running
> out. I attempt to make sure that all of the paths out -- like exceptions and
> branches -- that don't do what we want have GUSA state zapped.
> 
> At which point ya gets what ya gets.  Not atomic, but not really wrong either.
> 

Thanks for all the details, things are much clearer now. Therefore:

Reviewed-by: Aurelien Jarno <address@hidden>


That said for further improvements did you consider decoding the gUSA
section in a helper. It might avoid having to emulate the atomic
sequence with 3 TBs in the worst case (the original one, the one to
decode the sequence and the one holding the exclusive lock). The helper
should directly have access to the r0 value, can decode the atomic 
sequence and translate it into a call to the corresponding atomic
helpers. In the best case that means the sequence can be done in the
same TB. In the worst case, where the gUSA sequence is not recognized
it means two TB.

I am might have forgotten something though.

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
address@hidden                 http://www.aurel32.net

[Prev in Thread]

Current Thread

[Next in Thread]

[Qemu-devel] [PATCH v2 04/27] target/sh4: Keep env->flags clean, (continued)
- [Qemu-devel] [PATCH v2 04/27] target/sh4: Keep env->flags clean, Richard Henderson, 2017/07/06
  - Re: [Qemu-devel] [PATCH v2 04/27] target/sh4: Keep env->flags clean, Aurelien Jarno, 2017/07/07
  - Re: [Qemu-devel] [PATCH v2 04/27] target/sh4: Keep env->flags clean, Philippe Mathieu-Daudé, 2017/07/08
- [Qemu-devel] [PATCH v2 05/27] target/sh4: Adjust TB_FLAG_PENDING_MOVCA, Richard Henderson, 2017/07/06
  - Re: [Qemu-devel] [PATCH v2 05/27] target/sh4: Adjust TB_FLAG_PENDING_MOVCA, Aurelien Jarno, 2017/07/07
  - Re: [Qemu-devel] [PATCH v2 05/27] target/sh4: Adjust TB_FLAG_PENDING_MOVCA, Philippe Mathieu-Daudé, 2017/07/08
- [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics, Richard Henderson, 2017/07/06
  - Re: [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics, Aurelien Jarno, 2017/07/15
    - Re: [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics, John Paul Adrian Glaubitz, 2017/07/15
    - Re: [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics, Richard Henderson, 2017/07/15
    - Re: [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics, Aurelien Jarno <=
    - Re: [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics, Richard Henderson, 2017/07/16
    - Re: [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics, Aurelien Jarno, 2017/07/16
    - Re: [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics, Richard Henderson, 2017/07/16
    - Re: [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics, Aurelien Jarno, 2017/07/16
- [Qemu-devel] [PATCH v2 07/27] target/sh4: Recognize common gUSA sequences, Richard Henderson, 2017/07/06
  - Re: [Qemu-devel] [PATCH v2 07/27] target/sh4: Recognize common gUSA sequences, Aurelien Jarno, 2017/07/17
- [Qemu-devel] [PATCH v2 08/27] linux-user/sh4: Notice gUSA regions during signal delivery, Richard Henderson, 2017/07/06
  - Re: [Qemu-devel] [PATCH v2 08/27] linux-user/sh4: Notice gUSA regions during signal delivery, John Paul Adrian Glaubitz, 2017/07/07
    - Re: [Qemu-devel] [PATCH v2 08/27] linux-user/sh4: Notice gUSA regions during signal delivery, Richard Henderson, 2017/07/07
    - Re: [Qemu-devel] [PATCH v2 08/27] linux-user/sh4: Notice gUSA regions during signal delivery, John Paul Adrian Glaubitz, 2017/07/07

Prev by Date: Re: [Qemu-devel] [PATCH 2/3] test-qdev-global-props: Test global property ordering
Next by Date: Re: [Qemu-devel] [PATCH v2 09/27] linux-user/sh4: Clean env->flags on signal boundaries
Previous by thread: Re: [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics
Next by thread: Re: [Qemu-devel] [PATCH v2 06/27] target/sh4: Handle user-space atomics
Index(es):
- Date
- Thread