lightning
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Lightning] jit_qdivr_u trashes JIT_R0 on x86_64


From: Marc Nieper-Wißkirchen
Subject: Re: [Lightning] jit_qdivr_u trashes JIT_R0 on x86_64
Date: Wed, 4 Sep 2019 19:30:49 +0200

Hi Paulo,

Am Mi., 4. Sept. 2019 um 17:40 Uhr schrieb Paulo César Pereira de
Andrade <address@hidden>:
>
> Em qua, 4 de set de 2019 às 11:23, Marc Nieper-Wißkirchen
> <address@hidden> escreveu:
> >
> > Hi Paulo,
>
>   Hi Marc,
>
> > > > To which extent is code like your example code guaranteed to work,
> > > > Paulo? For example, could we mark *all* registers live after "jmpi
> > > > helper"?
> > >
> > >   Lightning might run out of registers, cause an assertion and return
> > > JIT_NOREG. If lightning it compiled with -DNDEBUG the assertion
> > > will become a noop and it will generate invalid code, after several
> > > assertions of an invalid register.
> > >
> > > > On some ports, like x86_64, a jmpi instruction needs a scratch
> > > > register, and when all registers are marked live after jmpi, the only
> > > > way to get a scratch register is to spill one live register before the
> > > > jmpi and reload it afterward.
> > >
> > >   When it cannot spill/reload, it calls jit_get_reg with the
> > > jit_class_nospill flag, and if there are no free registers it fails as
> > > described above.
> >
> > Thank you for this information.
> >
> > Would it be enough to ensure that there is at least one non-live
> > register around instructions with non-local control flow so that this
> > assertion is never triggered? (A code generator using GNU lightning as
> > a backend must never cause this assertion being triggered or it would
> > be buggy.)
>
>   This would require some quite complex code to simulate all conditions
> this can happen. For example, on x86_64 would happen if the jump
> displacement does not fit in 32 bits. Until generating more than 4G
> of jit in a single block this will not be an issue :)
>   It can happen another way in x86_64, if 8 fpr register are live (need
> to use backend specific knowledge, as by default there are 6), and in a
> function that receives 8 fpr arguments and the argument register is still
> live (after getarg_{f|d} it should be dead), and then, do a branch
> comparing to an immediate. It would be very difficult to trigger it,
> basically need to write code specifically for it, to force all registers
> live.
>
>   Other than the larger than 32 bit displacement described above, it
> cannot happen in any port if using only 3 JIT_Rx, 3 JIT_Vx and 6
> JIT_Fx. If using backend extra registers, should write safer code,
> and understand the is the possibility of running out of registers.
>   The condition should be easy to understand as long as building with
> assertions enabled, and then, the code pattern should be easy to
> understand.
>
> > > > Does GNU lightning work this way?
> > > >
> > > > I have another question about your code: In the above example, the
> > > > callee-saved registers JIT_Vx are used to hold parameters for the
> > > > handler subroutine. Is it also possible to declare a caller-saved
> > > > register to use as an argument (in-going and/or out-going) for the
> > > > handler?
> > >
> > >     Not any kind of formal declaration, but it can use it in any way. It
> >
> > Okay, if I understood correctly this means that in a code like the
> > following, the register R0 can be used to safely transfer parameters
> > to and, in case it jumps back, from the handler procedure.
> >
> > jmpi handler
> > live (JIT_R0)
>
>   It should be marked live in the instruction following the label of
> the return address, more like this:
>
>     jmpi handler
> return_address:
>     live %r0

So this does also work when the handler address is absolute (say,
because the handler was generated in a previous run of GNU lightning)?

Thanks,

Marc

>
> and handler jumps back to return_address. Note that by just using
> %r0 it is implictly understood as live. The original example did not
> use %r0 (rax) before qdivr, so it was understood as dead. And was
> also not required because it was used after qdivr, and again
> understood as live. The example was actually an alternative to calling
> a function for common code.
>
>   The live %r0 above will just understand %r0 is live, until its value
> is set/used. There is a distiction from set to used, when used, it
> understands as live from when it was set to the when it was used. Things
> might become complex when there are branches, this why it assumes all
> registers are live, then, follow jumps, and 'declares dead' scratch
> register on function calls, jmpr and jmpi to unknown location.
>   Just setting a register value does not automatically mark it as live,
> but using the value marks it as live. Actually, for 100% safe cases,
> in the simple optimization pass it will optimize two consecutive
> assignments to a register, removing the first (it does not understand
> MMIO or similar, and is not supposed to be used with special purpose
> registers).
>
> > > is just inventing its own ABI. Using JIT_Vx for in/out arguments makes
> > > things simpler because they are not scratch registers, so, they are not
> > > clobbered in function calls. Note that the example is not a function
> > > call, but a jump to a common thunk only accessible with a jump. And
> > > as previously described, this thunk would be better inside a prolog/epilog
> > > pair, because if it needs to spill/reload a temporary, it will again
> > > cause an assertion, due to not having a stack frame.
> >
> > This, I understand and this is also how I have setup my code.
> >
> > Cheers,
> >
> > Marc
>
> Thanks,
> Paulo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]