[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2 0/9] mips: Fill delay slots v3
From: |
Paulo César Pereira de Andrade |
Subject: |
Re: [PATCH v2 0/9] mips: Fill delay slots v3 |
Date: |
Wed, 18 Jan 2023 10:27:45 -0300 |
Em qua., 18 de jan. de 2023 às 09:42, Paul Cercueil
<paul@crapouillou.net> escreveu:
>
> Hi Paulo,
Hi Paul,
> Le dimanche 15 janvier 2023 à 09:54 -0300, Paulo César Pereira de
> Andrade a écrit :
> > Em dom., 15 de jan. de 2023 às 08:05, Paul Cercueil
> > <paul@crapouillou.net> escreveu:
> > >
> > > Hi Paulo,
> >
> > Hi Paul,
> >
> > > Le samedi 14 janvier 2023 à 22:42 -0300, Paulo César Pereira de
> > > Andrade
> > > a écrit :
> > > > Em sáb., 14 de jan. de 2023 às 12:11, Paul Cercueil
> > > > <paul@crapouillou.net> escreveu:
> > > > >
> > > > > Hi Paulo,
> > > >
> > > > Hi Paul,
> > > >
> > > > Patches pushed as all tests pass.
> > >
> > > Great! ;)
> > >
> > > > > Here's the V3 of my patchset that attempts to fill branch delay
> > > > > slots on
> > > > > MIPS.
> > > > >
> > > > > All tests do pass now, except the "float.nodata" check, but
> > > > > this
> > > > > one
> > > > > fails in master as well.
> > > >
> > > > What environment are you using for testing? It works in the
> > > > mips
> > > > environments I am testing.
> > > >
> > > > Please let me know the output of:
> > > >
> > > > $ cpp -dM </dev/null
> > >
> > > Just a mips64 toolchain built using buildroot.
> > >
> > > I believe you meant
> > > cpp -dM -E - < /dev/null
> >
> > It is a very recent gcc, and probably kernel. My guess is that the
> > issue
> > is caused by some kind of unaligned access. Is it real hardware or
> > some
> > emulator?
> >
> > Can you please run it as:
> >
> > $ cd check
> > $ make debug
> > ...
> > (gdb) r -d float.tst
> >
> > and see what if it is just somehow creating a bad immediate or if
> > it is triggering some kind of fault, like misaligned or invalid
> > instruction?
>
> This was under Qemu with my MIPS64 toolchain.
>
> However today as I try to reproduce it, all the checks do pass. Maybe a
> system update on my PC brought a fix (to qemu), or maybe it does happen
> only under some conditions.
>
> I'll debug it if I encounter it again.
>
> > > See attachment.
> > >
> > > > > I implemented your suggestion and it worked just fine. There
> > > > > was
> > > > > one
> > > > > last problem though, which caused the "clobber" check to fail
> > > > > once
> > > > > again; code emitters that generated branches to the next
> > > > > instruction (to conditionally execute some opcodes) caused
> > > > > problems, as
> > > > > the next instruction was not detected as a jump target (no
> > > > > "jit_flag_patch" set). I solved this by just checking the
> > > > > previous
> > > > > node's code against a table of accepted values.
> > > >
> > > > Not certain if I fully understand this. It also appears to be a
> > > > somewhat
> > > > fragile logic.
> > > >
> > > > Everytime the delay slot is used, there should be a comment
> > > > telling that instruction is in the delay slot, for example:
> > > >
> > > > static void
> > > > _ltr_f(jit_state_t *_jit, jit_int32_t r0, jit_int32_t r1,
> > > > jit_int32_t
> > > > r2)
> > > > {
> > > > jit_word_t w;
> > > > C_OLT_S(r1, r2);
> > > > w = _jit->pc.w;
> > > > BC1T(0);
> > > > /* delay slot */
> > > > movi(r0, 1);
> > > > movi(r0, 0);
> > > > patch_at(w, _jit->pc.w);
> > > > }
> > >
> > > There is a comment indeed, but the comment is gone as soon as you
> > > compile the code.
> > >
> > > If you have a 'ltr_f' followed by a 'beqr', the code will try to
> > > find a
> > > delay slot for the 'beqr', and will detect that the 'movi(r0, 0)'
> > > (aka.
> > > the last opcode generated by 'ltr_f') can safely be moved, because
> > > there is nothing that can tell us that the last movi is a jump
> > > target.
> > >
> > > So I just blacklist 'ltr_f' as well as others so that the emitted
> > > code
> > > will always be correct.
> >
> > I understand now. The problem I see is that it is not very clear
> > from
> > the patch, and/or changes elsewhere could cause it to no longer be
> > required, or have some missing check if new lightning instructions
> > are added.
> >
> > The problem is that can_swap_ds() does not check the pattern of a
> > jump in 2 (for fpr comparison) or 3 (for compare-and-swap)
> > instructions
> > before the last machine instruction that it attempts to move.
>
> Yes, it only checks the last instruction. Going further than that would
> increase the complexity of the code, I think.
Could save the address of the last sequence of opcodes to
implement a lightning instruction and do a simple disassembly
and check if there are jumps inside that chunk, or to the next
instruction after the chunk. Could be a bit more complex in
code, but smaller in textual C code, as it would be probably
be no longer than the switch for prev->code. Can look at _patch_at
to see how it checks for jumps and the displacement.
> > Currently there isn't a clear way to pass parameters to jit
> > generation.
> > The closest is the jit_cpu_t for arm and x86. Maybe we could add a
> > jit_cpu_t for mips, and have an option to make the swap with delay
> > slot optional? This would make it easy to validate that if a bug is
> > found
> > in the future, it can be easily verified that it is not a side effect
> > of the
> > changes in these patches. That is, have a defined jit_cpu_t for mips
> > in include/lightning/jit_mips.h and then in
> > lib/jit_mips.c:jit_get_cpu()
> > set it to a default value, and use the flag allowing or disallowing
> > the
> > patch during jit generation.
>
> That should be doable, yes. But what would set the value of the flag?
> Would it be a ./configure option?
Just declare a jit_cpu_t like done in include/lightning/jit_x86.h
and include/lightning/jit_arm.h
It would have a single bit flag, that would be checked, for example:
static jit_bool_t _can_swap_ds(jit_state_t *_jit, jit_node_t *prev,
...
- if (!prev)
+ if (!prev || !jit_cpu.can_swap_ds)
...
can choose another name for the flag.
The usage of jit_cpu_t is not documented, and should really be
used for cpu features, but is already used for testing, for example,
disabling jit_cpu.sse2. It is only checked for i686, and dynamically
switches to use x87 registers instead of xmm registers.
> > > > so, you check the previous opcode for float comparison and
> > > > cas{r,i}.
> > > > Shouldn't it also check for b{o,x}{add,sub}{r,i} codes? These
> > > > also
> > > > end with a delay slot usage, and based on the patches and your
> > > > description, not checking them might create very difficult to
> > > > debug
> > > > bugs in jit doing complex branch logic on carry/overflow.
> > >
> > > The b{o,x}{add,sub}{r,i} end up with a delay slot, and that's
> > > perfectly
> > > fine. The algorithm will detect that and won't try to move the
> > > opcode
> > > (that's the has_delay_slot() check in can_swap_ds()).
> > >
> > > Cheers,
> > > -Paul
> >
> > Thanks,
> > Paulo
>
> Cheers,
> -Paul
These are just suggestions. For the moment it should be good
as is. Checking if any issue is related to this patch would be just
rebuilding and always returning zero from can_swap_ds().
Thanks,
Paulo
- [PATCH v3 3/8] mips: Fill delay slots of JALR opcodes in jit_callr, (continued)
- [PATCH v3 3/8] mips: Fill delay slots of JALR opcodes in jit_callr, Paul Cercueil, 2023/01/14
- [PATCH v3 4/8] mips: Fill delay slots of J in jit_jmpi, Paul Cercueil, 2023/01/14
- [PATCH v3 5/8] mips: Fill delay slots in jit_beqr / jit_beqi, Paul Cercueil, 2023/01/14
- [PATCH v3 6/8] mips: Fill delay slots in jit_bner / jit_bnei, Paul Cercueil, 2023/01/14
- [PATCH v3 7/8] mips: Fill delay slots in jit_bgtr, jit_bgti, jit_bler, jit_blei, Paul Cercueil, 2023/01/14
- [PATCH v3 8/8] mips: Fill delay slots in jit_bger, jit_bgei, jit_bltr, jit_blti, Paul Cercueil, 2023/01/14
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paulo César Pereira de Andrade, 2023/01/14
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paul Cercueil, 2023/01/15
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paulo César Pereira de Andrade, 2023/01/15
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paul Cercueil, 2023/01/18
- Re: [PATCH v2 0/9] mips: Fill delay slots v3,
Paulo César Pereira de Andrade <=
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paulo César Pereira de Andrade, 2023/01/20
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paul Cercueil, 2023/01/20
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paulo César Pereira de Andrade, 2023/01/20
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paulo César Pereira de Andrade, 2023/01/20
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paul Cercueil, 2023/01/20
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paulo César Pereira de Andrade, 2023/01/20
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paulo César Pereira de Andrade, 2023/01/20