Re: [PATCH] Add support for a helper with 7 arguments

From: Richard Henderson
Subject: Re: [PATCH] Add support for a helper with 7 arguments
Date: Fri, 7 Feb 2020 15:49:44 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.4.1

On 2/7/20 12:43 PM, Taylor Simpson wrote:
>> -----Original Message-----
>> From: Richard Henderson <address@hidden>
>> But I encourage you to re-think your purely mechanical approach to the 
>> hexagon port.  It seems to me that you should be doing much more during
>> the translation phase so that you can minimize the number of helpers that
>> you require.
> There are a couple of things we could do
> - Short term: Add #ifdef's to the generated code so that the helper isn't
>   compiled when there is a fWRAP_<tag> defined.  There are currently ~500
>   instructions where this is the case.


> - Long term: Integrate rev.ng's approach that uses flex/bison to parse the
> semantics and generate TCG code.

There is perhaps an intermediate step that merely special-cases the load/store
insns.  With rare exceptions (hah!) these are the cases that will most often
raise an exception.  Moreover, they are the *only* cases that can raise an
exception without requiring a helper call anyway.

There are a number of cases that I can think of:

          r6 = memb(r1)
          r7 = memb(r2)

        qemu_ld   t0, r1, MO_UB, mmu_idx
        qemu_ld   t1, r2, MO_UB, mmu_idx
        mov       r6, t0
        mov       r7, t1

          r6 = memb(r1)
          memb(r2) = r7

        qemu_ld   t0, r1, MO_UB, mmu_idx
        qemu_st   r7, r2, MO_UB, mmu_idx
        mov       r6, t0

These being the "normal" case wherein the memops are unconditional, and can
simply use a temp for semantics.  Similarly for MEMOP, NV, or SYSTEM insns in

          r6 = memb(r1)
          if (p0) r7 = memb(r2)

        qemu_ld   l0, r1, MO_UB, mmu_idx
        andi      t1, p0, 1
        brcondi   t1, 0, L1
        qemu_ld   r7, r2, MO_UB, mmu_idx
 L1:
        mov       r6, l0

For a conditional load in slot 0, we can load directly into the final
destination register and skip the temporary.

Because TCG doesn't do global register allocation, any temporary crossing a
basic block boundary gets flushed to stack.  So this avoids sending the r7
value through an unnecessary round trip.

This works because (obviously) nothing can raise an exception after slot0, and
the only thing that comes after is the commit phase.  This can be extended to a
conditional load in slot1, when we notice that the insn in slot0 cannot raise
an exception.
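
To make the ordering argument concrete, here is a minimal Python model of the
packet { r6 = memb(r1); if (p0) r7 = memb(r2) }.  It is only a sketch of the
semantics, not QEMU code: Fault, load_byte and run_packet are hypothetical
names, guest memory is a plain dict, and an unmapped address stands in for
any exception a load might raise.

```python
class Fault(Exception):
    """Hypothetical stand-in for a guest memory exception."""


def load_byte(mem, addr):
    # Model of an unsigned byte load (MO_UB): fault if the address is unmapped.
    if addr not in mem:
        raise Fault(addr)
    return mem[addr] & 0xFF


def run_packet(regs, p0, mem):
    """Model of the packet { r6 = memb(r1); if (p0) r7 = memb(r2) }."""
    # Slot1: the result is held in a temporary, because slot0 may still fault.
    tmp = load_byte(mem, regs["r1"])
    # Slot0: the conditional load is the last point that can fault, so it may
    # write its destination register directly.
    if p0 & 1:
        regs["r7"] = load_byte(mem, regs["r2"])
    # Commit phase: nothing can fault any more, so r6 becomes visible here.
    regs["r6"] = tmp
```

If the slot0 load faults, the exception propagates before either r6 or r7 is
written, so the model leaves the register file untouched, which is the
property the temporary for r6 exists to preserve.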

          memb(r1) = r3
          memb(r2) = r4

        call     helper_probe_access, r1, MMU_DATA_STORE, 1
        call     helper_probe_access, r2, MMU_DATA_STORE, 1
        qemu_st  r3, r1, MO_UB, mmu_idx
        qemu_st  r4, r2, MO_UB, mmu_idx

          memb(r1) = r3
          r4 = memb(r2)

        call     helper_probe_access, r1, MMU_DATA_STORE, 1
        call     helper_probe_access, r2, MMU_DATA_LOAD, 1
        qemu_st  r3, r1, MO_UB, mmu_idx
        qemu_ld  r4, r2, MO_UB, mmu_idx

These cases with a store in slot1 are irritating, because I see that (1) all
exceptions must be recognized before anything commits, and (2) slot1 exceptions
must have preference over slot0 exceptions.  But we can probe them easily.
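
As a sketch of the probe-then-commit ordering for the two-store packet, the
following Python model checks both addresses before performing either store.
It is an illustration under simplifying assumptions, not the QEMU helper:
Fault, probe_access and store_pair are hypothetical names, guest memory is a
dict whose keys are the mapped byte addresses, and only byte stores are
modelled.

```python
class Fault(Exception):
    """Hypothetical stand-in for a guest memory exception."""


def probe_access(mem, addr, size):
    # Check that every byte of the access is mapped, without touching memory.
    for a in range(addr, addr + size):
        if a not in mem:
            raise Fault(a)


def store_pair(mem, slot1, slot0):
    """Model of { memb(a1) = v1; memb(a0) = v0 }; each slot is (addr, value)."""
    # Probe in slot1, slot0 order, so a slot1 fault is reported in preference
    # to a slot0 fault, and both are recognized before anything commits.
    for addr, _ in (slot1, slot0):
        probe_access(mem, addr, 1)
    # Both probes passed, so neither store can fault from here on.
    for addr, value in (slot1, slot0):
        mem[addr] = value & 0xFF
```

When only the slot0 address is unmapped, the slot1 store is never performed;
when both are unmapped, the fault raised carries the slot1 address.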

> - Long long term: A much more general approach will be to turn the C
> semantics code into LLVM IR and generate TCG from the IR.

Why would you imagine this to be more interesting than using flex/bison?

