lightning
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Lightning] jit_qdivr_u trashes JIT_R0 on x86_64


From: Paulo César Pereira de Andrade
Subject: Re: [Lightning] jit_qdivr_u trashes JIT_R0 on x86_64
Date: Wed, 4 Sep 2019 14:58:27 -0300

Em qua, 4 de set de 2019 às 14:31, Marc Nieper-Wißkirchen
<address@hidden> escreveu:

  Hi Marc,

> Hi Paulo,
>

> > > > > To which extent is code like your example code guaranteed to work,
> > > > > Paulo? For example, could we mark *all* registers live after "jmpi
> > > > > helper"?
> > > >
> > > >   Lightning might run out of registers, cause an assertion and return
> > > > JIT_NOREG. If lightning it compiled with -DNDEBUG the assertion
> > > > will become a noop and it will generate invalid code, after several
> > > > assertions of an invalid register.
> > > >
> > > > > On some ports, like x86_64, a jmpi instruction needs a scratch
> > > > > register, and when all registers are marked live after jmpi, the only
> > > > > way to get a scratch register is to spill one live register before the
> > > > > jmpi and reload it afterward.
> > > >
> > > >   When it cannot spill/reload, it calls jit_get_reg with the
> > > > jit_class_nospill flag, and if there are no free registers it fails as
> > > > described above.
> > >
> > > Thank you for this information.
> > >
> > > Would it be enough to ensure that there is at least one non-live
> > > register around instructions with non-local control flow so that this
> > > assertion is never triggered? (A code generator using GNU lightning as
> > > a backend must never cause this assertion being triggered or it would
> > > be buggy.)
> >
> >   This would require some quite complex code to simulate all conditions
> > this can happen. For example, on x86_64 would happen if the jump
> > displacement does not fit in 32 bits. Until generating more than 4G
> > of jit in a single block this will not be an issue :)
> >   It can happen another way in x86_64, if 8 fpr register are live (need
> > to use backend specific knowledge, as by default there are 6), and in a
> > function that receives 8 fpr arguments and the argument register is still
> > live (after getarg_{f|d} it should be dead), and then, do a branch
> > comparing to an immediate. It would be very difficult to trigger it,
> > basically need to write code specifically for it, to force all registers
> > live.
> >
> >   Other than the larger than 32 bit displacement described above, it
> > cannot happen in any port if using only 3 JIT_Rx, 3 JIT_Vx and 6
> > JIT_Fx. If using backend extra registers, should write safer code,
> > and understand the is the possibility of running out of registers.
> >   The condition should be easy to understand as long as building with
> > assertions enabled, and then, the code pattern should be easy to
> > understand.
> >
> > > > > Does GNU lightning work this way?
> > > > >
> > > > > I have another question about your code: In the above example, the
> > > > > callee-saved registers JIT_Vx are used to hold parameters for the
> > > > > handler subroutine. Is it also possible to declare a caller-saved
> > > > > register to use as an argument (in-going and/or out-going) for the
> > > > > handler?
> > > >
> > > >     Not any kind of formal declaration, but it can use it in any way. It
> > >
> > > Okay, if I understood correctly this means that in a code like the
> > > following, the register R0 can be used to safely transfer parameters
> > > to and, in case it jumps back, from the handler procedure.
> > >
> > > jmpi handler
> > > live (JIT_R0)
> >
> >   It should be marked live in the instruction following the label of
> > the return address, more like this:
> >
> >     jmpi handler
> > return_address:
> >     live %r0
>
> So this does also work when the handler address is absolute (say,
> because the handler was generated in a previous run of GNU lightning)?

  It should work mostly the same way. But should take special care,
because then, the jmpi will not be followed to resolve live registers,
and at the jmpi point, it will consider non callee save registers dead.
  It should only be required to use 'live' on a context like this:

jmpi handler
return_address:
live %r0
<<< code that does not use %r0 >>>
jmpi another_handler_that_will_use_r0:
another_return_address:

this will prevent %r0 being used as a scratch register, or if used,
saved/restored in the '<<< code that does not use %r0 >>>' part.

> Thanks,
>
> Marc
>
> >
> > and handler jumps back to return_address. Note that by just using
> > %r0 it is implictly understood as live. The original example did not
> > use %r0 (rax) before qdivr, so it was understood as dead. And was
> > also not required because it was used after qdivr, and again
> > understood as live. The example was actually an alternative to calling
> > a function for common code.
> >
> >   The live %r0 above will just understand %r0 is live, until its value
> > is set/used. There is a distiction from set to used, when used, it
> > understands as live from when it was set to the when it was used. Things
> > might become complex when there are branches, this why it assumes all
> > registers are live, then, follow jumps, and 'declares dead' scratch
> > register on function calls, jmpr and jmpi to unknown location.
> >   Just setting a register value does not automatically mark it as live,
> > but using the value marks it as live. Actually, for 100% safe cases,
> > in the simple optimization pass it will optimize two consecutive
> > assignments to a register, removing the first (it does not understand
> > MMIO or similar, and is not supposed to be used with special purpose
> > registers).
> >
> > > > is just inventing its own ABI. Using JIT_Vx for in/out arguments makes
> > > > things simpler because they are not scratch registers, so, they are not
> > > > clobbered in function calls. Note that the example is not a function
> > > > call, but a jump to a common thunk only accessible with a jump. And
> > > > as previously described, this thunk would be better inside a 
> > > > prolog/epilog
> > > > pair, because if it needs to spill/reload a temporary, it will again
> > > > cause an assertion, due to not having a stack frame.
> > >
> > > This, I understand and this is also how I have setup my code.

Thanks,
Paulo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]