Re: Rework of x86 function prolog and epilog
From: Paul Cercueil
Subject: Re: Rework of x86 function prolog and epilog
Date: Fri, 20 Jan 2023 16:58:55 +0000

Hi Paulo,
On Wednesday, 18 January 2023 at 13:40 -0300, Paulo César Pereira de
Andrade wrote:
> Hi,
>
> A rework of the function prolog and epilog has been done for x86.
>
> Most notable changes are:
>
> o No fixed stack_framesize. This was from the original lightning. Now
>   it generates IR using a fixed stack_framesize but patches the
>   offsets before code generation to reduce stack usage.
>   It does so to allow an integer value to be either a register number
>   or an offset. If the value is too large to be a register number, it
>   knows it is a stack offset.
> o In some cases it does not set %rbp, only %rsp, as it saves and
>   restores registers in the prolog/epilog displaced from %rsp, and if
>   no alloca was used, %rbp is not really required.
>   Unfortunately, Lightning does not know about function attributes,
>   so if any non-jit function is called, it still sets the %rbp value.
>   "make check" passes all tests but catomic when %rbp is not set; that
>   test crashes in pthread_create. I am not certain about the reason,
>   but for now it sets %rbp (creates a frame pointer) to avoid possible
>   issues with unwinding or related usage from a called function.
> o The stack offset (CVT_OFFSET) used to move between x87 and sse, or
>   between gpr and fpr, is now dynamically allocated. It is very rarely
>   used, and keeping it at a fixed stack offset, even if not used,
>   would mostly defeat this patch due to an implicit alloca in every
>   function generated.
>
> Please let me know if you have any issues. It has no issues in any of
> my usages, and passes "make check" in all x86 variants: Linux and
> Windows x86_64, Linux x32, Linux and Windows 32 bit. But it is
> possible that some usage has been broken. I can only think of code
> that knows about Lightning internals and the offsets of registers
> relative to the frame pointer; if that is the case, using jit_frame()
> and jit_tramp() should provide the previous behavior.
>
> The big advantage of this change is to make it cheaper to translate
> some custom language function to a jit function, as the prolog/epilog
> cost is drastically reduced now.
Seems to work fine here as well.
Cheers,
-Paul
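
For readers unfamiliar with the trick described in the first bullet of the quoted message, here is a minimal sketch in C of how one integer operand can name either a register or a stack slot, with the slot offsets patched down before code generation. This is not Lightning's actual code: the constants NUM_REGS and PROVISIONAL_BASE, the 8-byte slot width, and all function names are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum {
    NUM_REGS = 16,            /* hypothetical: operands 0..15 are register numbers */
    PROVISIONAL_BASE = 4096   /* hypothetical: stack slots are encoded at or above this */
};

/* A value too large to be a register number must be a stack slot. */
static int is_stack_slot(int operand)
{
    return operand >= PROVISIONAL_BASE;
}

/* Encode byte offset `off` of a stack slot while building the IR. */
static int encode_slot(int off)
{
    assert(off >= 0);
    return PROVISIONAL_BASE + off;
}

/* Scan all operands and return the stack bytes really needed,
 * rounded up to a multiple of 16. Each slot is assumed 8 bytes wide. */
static int used_frame_size(const int *ops, size_t n)
{
    int end = 0;
    for (size_t i = 0; i < n; i++) {
        if (is_stack_slot(ops[i])) {
            int slot_end = ops[i] - PROVISIONAL_BASE + 8;
            if (slot_end > end)
                end = slot_end;
        }
    }
    return (end + 15) & ~15;
}

/* Patch step: final displacement from the stack pointer for a slot operand. */
static int final_displacement(int operand)
{
    assert(is_stack_slot(operand));
    return operand - PROVISIONAL_BASE;
}
```

In this sketch a function whose operands are all registers ends up with a frame size of 0, which is the case where the prolog and epilog become cheap.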