Re: Unaligned load/store opcodes
From: Paulo César Pereira de Andrade
Subject: Re: Unaligned load/store opcodes
Date: Tue, 28 Mar 2023 07:26:23 -0300

On Mon, 27 Mar 2023 at 19:17, Paul Cercueil
<paul@crapouillou.net> wrote:
>
> Hi Paulo,
>
> On Monday, 27 March 2023 at 12:14 -0300, Paulo César Pereira de Andrade
> wrote:
> > On Thu, 23 Mar 2023 at 13:50, Paulo César Pereira de Andrade
> > <paulo.cesar.pereira.de.andrade@gmail.com> wrote:
> > >
> > > On Thu, 23 Mar 2023 at 08:07, Paul Cercueil
> > > <paul@crapouillou.net> wrote:
> > > >
> > > > Hi Paulo,
> > >
> > > Hi Paul,
> > >
> > > > I think Lightning would benefit from having support for
> > > > 16/32/64-bit I/O to unaligned addresses. That's something I
> > > > would actually use.
> > > >
> > > > Something like:
> > > > ldur_s / ldur_us / ldur_i / ldur_ui / ldur_l
> > > > stur_s / stur_i / stur_l
> > >
> > > These can be added and fallbacks are mostly trivial.
> > >
> > > > I don't think we need ldx/stx variants.
> > >
> > > For completeness, and unless there is a specialized version for
> > > ldx/stx, a simple wrapper adding register values is easy.
> >
> > Using named versions would use too many jit_code_t values
> > for a complete set of something that has very few special use cases.
> >
> > > > What do you think?
> > >
> > > Most CPUs have some kind of help for unaligned reads, or just
> > > transparently allow them, but with slower loads/stores.
> >
> > Maybe we could think of something like:
> >
> > unldr output base size
> > unldi output base size
> > unldr_u output base size
> > unldi_u output base size
> > unsti base output size
> > unstr base output size
> >
> > and could be useful:
> >
> > unldr_f output base
> > unldi_d output base
> > unstr_f base output
> > unsti_d base output
> >
> > The versions with a register base could have an extra immediate
> > offset argument. But for consistency, better to not have this extra
> > immediate.
> >
> > Since only bytes are addressable, size would be in bytes, and would
> > also allow words of 3 bytes, and 5, 6 and 7 bytes for 64 bit.
>
> Do we really need this? ... The unaligned load/store would be useful
> for loading from unaligned fields in a C struct, for instance, but the
> fields themselves are always 1/2/4/8 bytes, so I don't know in which
> case you would need to load 3/5/6/7-byte "words".

I should have written 3/5/6/7-byte integers.

If the proposal is updated to use a single _f modifier, then using it
with 1/2/3/5/6/7-byte floats would trigger an assertion for now, but
the interface would leave room for them in a possible future.

The nonstandard integer and float sizes would not be very useful; they
would just be easy to add to the suggested ABI. They might be useful
for some language or abstraction built on lightning.

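To make the variable-size idea concrete, here is a minimal C sketch of what a byte-wise fallback for the proposed unldr/unldr_u/unstr could compute. The helper names unld, unld_u and unst are hypothetical and not part of lightning, and the sketch assumes little-endian byte order; a real backend would use the host byte order and whatever unaligned-access instructions the CPU provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a lightning API): zero-extending load of
 * `size` bytes (1..8) from a possibly unaligned address.
 * Assumption: the value is stored in little-endian byte order. */
static uint64_t unld_u(const void *base, size_t size)
{
    const unsigned char *p = base;
    uint64_t value = 0;
    size_t i;

    assert(size >= 1 && size <= 8);
    for (i = 0; i < size; i++)
        value |= (uint64_t)p[i] << (8 * i);
    return value;
}

/* Sign-extending variant: the top bit of the last byte is the sign. */
static int64_t unld(const void *base, size_t size)
{
    uint64_t value = unld_u(base, size);
    uint64_t sign = (uint64_t)1 << (8 * size - 1);

    /* (v ^ s) - s propagates the sign bit upward; the final cast
     * assumes a two's complement int64_t. */
    return (int64_t)((value ^ sign) - sign);
}

/* Store the low `size` bytes of `value`, one byte at a time. */
static void unst(void *base, uint64_t value, size_t size)
{
    unsigned char *p = base;
    size_t i;

    assert(size >= 1 && size <= 8);
    for (i = 0; i < size; i++)
        p[i] = (unsigned char)(value >> (8 * i));
}
```

With this shape, a 3-byte integer is just size == 3, and the sizes 1/2/4/8 fall out of the same loop.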
> > The float and double ones are just for convenience, and in most
> > cases are used for a double aligned at 4 bytes boundaries. There
> > are 2 (or other values) byte floats, but these are usually only in
> > software, and would be too much for lightning, which does not have
> > any kind of soft float support.
>
> I know that on MIPS32r2 for instance you have LWL/LWR/SWL/SWR for
> unaligned accesses, but I'm not aware of any mechanism to load/store
> floating-point on unaligned boundaries. You can't even load it into a
> GPR, because (at least on MIPS) there would be no way to transfer that
> value into a FPR. So I'd drop the _f/_d variants.

There are MFC1, MTC1, DMFC1 and DMTC1. Load into a GPR, then move the
bits "as is" between the GPR and the FPR. For consistency, this would
also require making public jit_movr_f_w and jit_movr_w_f, plus
jit_movr_d_ww and jit_movr_ww_d for 32 bit, or jit_movr_d_w and
jit_movr_w_d for 64 bit.

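As an illustration only, the same two-step idea can be written in plain C: do the unaligned access on an integer, then reinterpret the bits as a float. The function names below are hypothetical and are not lightning APIs; memcpy stands in for both the unaligned integer access and the MTC1/MFC1-style bit move, and the sketch assumes a 32-bit float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical analogue of a GPR-to-FPR move: copy the bits unchanged. */
static float bits_to_float(uint32_t bits)
{
    float value;

    assert(sizeof(value) == sizeof(bits));   /* assumes 32-bit float */
    memcpy(&value, &bits, sizeof(value));
    return value;
}

/* Hypothetical analogue of an FPR-to-GPR move. */
static uint32_t float_to_bits(float value)
{
    uint32_t bits;

    assert(sizeof(value) == sizeof(bits));
    memcpy(&bits, &value, sizeof(bits));
    return bits;
}

/* Unaligned float load: integer load first, then the bit move. */
static float unld_f(const void *base)
{
    uint32_t bits;

    memcpy(&bits, base, sizeof(bits));       /* unaligned-safe integer load */
    return bits_to_float(bits);
}

/* Unaligned float store: bit move first, then the integer store. */
static void unst_f(void *base, float value)
{
    uint32_t bits = float_to_bits(value);

    memcpy(base, &bits, sizeof(bits));       /* unaligned-safe integer store */
}
```

The double case would follow the same pattern with one 64-bit integer on 64-bit targets, or a pair of 32-bit integers on 32-bit targets.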
> Unrelated, but it's a bit confusing to have "ext" and "extr"
> instructions, could we maybe find a better name?
> "jit_extbr" for "extract bits"
> or "jit_maskr" for "extract mask"
> as two random suggestions.

The "ext" has been renamed to "extr". The most common naming pattern
is "extract" and "deposit" bits. These operations are also somewhat
similar to sign/zero extension. The most confusing case is the pair
"extr_ui r0 r1" and "extr_u r0 r1 i0 i1". Renaming the existing ones
to sextr or uextr now would be worse. Still, renaming the ones not yet
available in an official release remains an option.

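For readers unfamiliar with the extract/deposit naming, this is a rough C sketch of the semantics. It assumes the two immediates are a bit offset counted from the least significant bit and a bit count; that reading, and the function names, are illustrative assumptions rather than lightning's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low `length` bits set (1 <= length <= 64). */
static uint64_t field_mask(unsigned length)
{
    return length == 64 ? ~(uint64_t)0 : (((uint64_t)1 << length) - 1);
}

/* Zero-extending extract of `length` bits starting at bit `offset`. */
static uint64_t extract_u(uint64_t word, unsigned offset, unsigned length)
{
    assert(length >= 1 && offset + length <= 64);
    return (word >> offset) & field_mask(length);
}

/* Sign-extending extract: the top bit of the field is the sign.
 * The final cast assumes a two's complement int64_t. */
static int64_t extract_s(uint64_t word, unsigned offset, unsigned length)
{
    uint64_t field = extract_u(word, offset, length);
    uint64_t sign = (uint64_t)1 << (length - 1);

    return (int64_t)((field ^ sign) - sign);
}

/* Deposit the low `length` bits of `value` into `word` at `offset`,
 * leaving all other bits of `word` unchanged. */
static uint64_t deposit(uint64_t word, uint64_t value,
                        unsigned offset, unsigned length)
{
    uint64_t mask;

    assert(length >= 1 && offset + length <= 64);
    mask = field_mask(length) << offset;
    return (word & ~mask) | ((value << offset) & mask);
}
```

Under this reading, extract_u(x, 0, 32) gives the same result as a plain zero extension of the low 32 bits, which is where the similarity between the two "extr_u" forms comes from.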
> Cheers,
> -Paul
Thanks,
Paulo