lightning
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 0/9] mips: Fill delay slots v3


From: Paulo César Pereira de Andrade
Subject: Re: [PATCH v2 0/9] mips: Fill delay slots v3
Date: Fri, 20 Jan 2023 15:18:13 -0300

  Hi Paul,

[snip]
> > > >   Could save the address of the last sequence of opcodes to
> > > > implement a lightning instruction and do a simple disassembly
> > > > and check if there are jumps inside that chunk, or to the next
> > > > instruction after the chunk. Could be a bit more complex in
> > > > code, but smaller in textual C code, as it would be probably
> > > > be no longer than the switch for prev->code. Can look at _patch_at
> > > > to see how it checks for jumps and the displacement.
> > >
> > >   I have a new regression.
> > >   check/tramp.tst with the patch, and manually adding Lighting
> > > instructions
> > > generates at start:
> > >
> > >     jit_jmpi(main)
> > > 0x3520968000:    j    0x3520968118
> > > 0x3520968004:    sltiu    at,s0,2
> > >    jit_tramp(64)
> > > 0x3520968008:    bnez    at,0x3520968110
> > > 0x352096800c:    nop
> > >
> > > and without the patch:
> > >
> > >     jit_jmpi(main)
> > > 0x2d884cc000:    j    0x2d884cc12c
> > > 0x2d884cc004:    nop
> > >    jit_tramp(64)
> > > 0x2d884cc008:    sltiu    at,s0,2
> > > 0x2d884cc00c:    bnez    at,0x2d884cc120
> > > 0x2d884cc010:    nop
> > >
> > > jit_tramp() calls jit_prolog(), but it does not generate code, just
> > > assumes a prolog. The patch removes the delay slot of the jump
> > > to the start of the code and moves the first instruction before
> > > the implicit label.
> > >
> > >   Found it working on patchset to use a variable stack framesize
> > > and other optimizations for leaf functions.
> > >
> > >   Temporary fix was to just make can_swap_ds always return 0.
> > >
> > >   Hopefully soon  will try to redesign the logic as described
> > > earlier,
> > > to make a quick disassembly.
> > >
> > >   First patch was:
> > >
> > > -    if (!prev)
> > > +    /* FIXME this fixes tramp.tst and ctramp.c tests */
> > > +    if (1||!prev)
> > >
> > > while writing this email, just validated that this also works:
> > >
> > >      switch (prev->code) {
> > > +   case jit_code_prolog:
> > >      case jit_code_ltr_f:
> > >
> > > or, could also special case like this:
> > >
> > >      switch (prev->code) {
> > > +   case jit_code_prolog: if (jitc->function->assume_frame) return 0;
> > >      case jit_code_ltr_f:
> >
> > Good catch.
> >
> > I think this solution would work best.
>
>   I am afraid I did reply too fast :( and did not do a proper make clean
> before a new set of tests. Still needs the uglier patch :(
>
>   I should push soon changes for mips and you can also check the
> problem.
>
>   for the moment I got it working in NEW_ABI 64 bit, and am finishing
> testing with old abi and 32 bit, both little endian.

  Just pushed 
https://git.savannah.gnu.org/cgit/lightning.git/commit/?id=7e919226dd8c8d6b74bcb9564dc2c6f8454ed2cb
  You can check the problem.
  It probably was not a side effect of the new patches, just that it did
work before due to a different value in the register, but it was in an
undefined behavior state.

  The patch I pushed makes can_swap_ds always return false.

  I will try to work a bit on the logic of the patch as well, as using the
delay slot is a good idea, instead of having the nops.

  BTW, has_delay_slot is buggy. It is just that there is no generation
of the L modifiers (likely) in Lightning, to keep things simpler. For
example, BEQ and BEQL, for BEQL the delay slot is only executed
if the branch is taken.

Thanks,
Paulo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]