[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2 0/9] mips: Fill delay slots v3
From: |
Paulo César Pereira de Andrade |
Subject: |
Re: [PATCH v2 0/9] mips: Fill delay slots v3 |
Date: |
Fri, 20 Jan 2023 16:02:55 -0300 |
Em sex., 20 de jan. de 2023 às 15:23, Paul Cercueil
<paul@crapouillou.net> escreveu:
>
> Hi Paulo,
>
> Le vendredi 20 janvier 2023 à 15:18 -0300, Paulo César Pereira de
> Andrade a écrit :
> > Hi Paul,
> >
> > [snip]
> > > > > > Could save the address of the last sequence of opcodes to
> > > > > > implement a lightning instruction and do a simple disassembly
> > > > > > and check if there are jumps inside that chunk, or to the
> > > > > > next
> > > > > > instruction after the chunk. Could be a bit more complex in
> > > > > > code, but smaller in textual C code, as it would be probably
> > > > > > be no longer than the switch for prev->code. Can look at
> > > > > > _patch_at
> > > > > > to see how it checks for jumps and the displacement.
> > > > >
> > > > > I have a new regression.
> > > > > check/tramp.tst with the patch, and manually adding Lighting
> > > > > instructions
> > > > > generates at start:
> > > > >
> > > > > jit_jmpi(main)
> > > > > 0x3520968000: j 0x3520968118
> > > > > 0x3520968004: sltiu at,s0,2
> > > > > jit_tramp(64)
> > > > > 0x3520968008: bnez at,0x3520968110
> > > > > 0x352096800c: nop
> > > > >
> > > > > and without the patch:
> > > > >
> > > > > jit_jmpi(main)
> > > > > 0x2d884cc000: j 0x2d884cc12c
> > > > > 0x2d884cc004: nop
> > > > > jit_tramp(64)
> > > > > 0x2d884cc008: sltiu at,s0,2
> > > > > 0x2d884cc00c: bnez at,0x2d884cc120
> > > > > 0x2d884cc010: nop
> > > > >
> > > > > jit_tramp() calls jit_prolog(), but it does not generate code,
> > > > > just
> > > > > assumes a prolog. The patch removes the delay slot of the jump
> > > > > to the start of the code and moves the first instruction before
> > > > > the implicit label.
> > > > >
> > > > > Found it working on patchset to use a variable stack
> > > > > framesize
> > > > > and other optimizations for leaf functions.
> > > > >
> > > > > Temporary fix was to just make can_swap_ds always return 0.
> > > > >
> > > > > Hopefully soon will try to redesign the logic as described
> > > > > earlier,
> > > > > to make a quick disassembly.
> > > > >
> > > > > First patch was:
> > > > >
> > > > > - if (!prev)
> > > > > + /* FIXME this fixes tramp.tst and ctramp.c tests */
> > > > > + if (1||!prev)
> > > > >
> > > > > while writing this email, just validated that this also works:
> > > > >
> > > > > switch (prev->code) {
> > > > > + case jit_code_prolog:
> > > > > case jit_code_ltr_f:
> > > > >
> > > > > or, could also special case like this:
> > > > >
> > > > > switch (prev->code) {
> > > > > + case jit_code_prolog: if (jitc->function->assume_frame)
> > > > > return 0;
> > > > > case jit_code_ltr_f:
> > > >
> > > > Good catch.
> > > >
> > > > I think this solution would work best.
> > >
> > > I am afraid I did reply too fast :( and did not do a proper make
> > > clean
> > > before a new set of tests. Still needs the uglier patch :(
> > >
> > > I should push soon changes for mips and you can also check the
> > > problem.
> > >
> > > for the moment I got it working in NEW_ABI 64 bit, and am
> > > finishing
> > > testing with old abi and 32 bit, both little endian.
> >
> > Just pushed
> > https://git.savannah.gnu.org/cgit/lightning.git/commit/?id=7e919226dd8c8d6b74bcb9564dc2c6f8454ed2cb
> > You can check the problem.
> > It probably was not a side effect of the new patches, just that it
> > did
> > work before due to a different value in the register, but it was in
> > an
> > undefined behavior state.
> >
> > The patch I pushed makes can_swap_ds always return false.
> >
> > I will try to work a bit on the logic of the patch as well, as
> > using the
> > delay slot is a good idea, instead of having the nops.
> >
> > BTW, has_delay_slot is buggy. It is just that there is no
> > generation
> > of the L modifiers (likely) in Lightning, to keep things simpler. For
> > example, BEQ and BEQL, for BEQL the delay slot is only executed
> > if the branch is taken.
>
> Well that's not a bug, if Lightning does not generate these. And it
> will never will, as the branch-likely opcodes are deprecated in
> MIPS32/MIPS64.
The documentation I have available might be outdated :) I just see
the "likely" instructions as MIPS II, and the generic ones as MIPS I.
Either way, it should be available on some systems, but very low
on todo list to use these, that could be generated conditionally, if
known to be available; I would expect the "branch likely" instructions
usage could be beneficial.
> Cheers,
> -Paul
Thanks,
Paulo
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, (continued)
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paulo César Pereira de Andrade, 2023/01/14
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paul Cercueil, 2023/01/15
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paulo César Pereira de Andrade, 2023/01/15
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paul Cercueil, 2023/01/18
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paulo César Pereira de Andrade, 2023/01/18
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paulo César Pereira de Andrade, 2023/01/20
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paul Cercueil, 2023/01/20
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paulo César Pereira de Andrade, 2023/01/20
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paulo César Pereira de Andrade, 2023/01/20
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paul Cercueil, 2023/01/20
- Re: [PATCH v2 0/9] mips: Fill delay slots v3,
Paulo César Pereira de Andrade <=
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paulo César Pereira de Andrade, 2023/01/20