Subject: [Lightning] Bug in x86_64
Date: Thu, 12 Jun 2008 13:40:42 -0400
I am porting to the 64-bit JIT now, and I stumbled on a bug.
I have a code sequence as follows:
jit_prolog(2); // declare that the function takes two args
int ofs1 = jit_arg_p();
int ofs2 = jit_arg_p();
jit_getarg_p(CJIT_V1, ofs1); // V1 <- ColAllocator* (arg0)
jit_getarg_p(CJIT_V2, ofs2); // V2 <- ColEnvI* (arg1)
This works smoothly in 32-bit mode. As you guessed, this is the start of a function that takes two pointers as arguments.
Now, on x86_64, the arguments are passed in registers. Specifically
A0 -> RDI
A1 -> RSI
A2 -> RDX
A3 -> RCX
A4 -> R8
A5 -> R9
The code calling the above is a fragment written in C++. I checked its disassembly and it indeed conforms to this protocol. By the time we call, the two pointers (say A and B) are in RDI and RSI.
Now here is the x86_64 assembly (AT&T syntax, source operand first) emitted for the above by lightning:
0x1000362c45: push %rbx
0x1000362c46: push %r12
0x1000362c48: push %r13
0x1000362c4a: push %rbp
0x1000362c4b: mov %rsp,%rbp
0x1000362c4e: mov %rdi,%rsi
0x1000362c51: mov %rsi,%rdi
The first four push operations are part of the callee-saved logic. All good. The instruction
mov %rsp,%rbp
makes sense as well. It sets up the frame pointer.
Now the next two instructions correspond to the two jit_getarg_p macros.
Since V1 is mapped to RSI and V2 is mapped to RDI, the first instruction picks up the first argument (say A, currently held in RDI, the register for A0) and moves it to V1, which is RSI.
Boom: I just destroyed A1, which was still sitting in RSI.
The next instruction moves what should have been A1 (but is a copy of A0's content) into V2, aka RDI
So now A0 == A1, not the intended effect ;-)
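To make the clobbering concrete, here is a small sketch that replays the two moves on a toy register model in plain C. It does not use lightning at all; the struct and the function names (regs_t, naive_getargs, check_naive_getargs) are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: only the two registers that matter here. */
typedef struct {
    uint64_t rdi; /* holds A0 on entry, is V2 after the getargs */
    uint64_t rsi; /* holds A1 on entry, is V1 after the getargs */
} regs_t;

/* Replays the two emitted moves in program order. */
static regs_t naive_getargs(uint64_t a0, uint64_t a1)
{
    regs_t r = { a0, a1 };
    r.rsi = r.rdi; /* mov %rdi,%rsi : V1 <- A0, overwriting A1 */
    r.rdi = r.rsi; /* mov %rsi,%rdi : V2 <- "A1", really a copy of A0 */
    return r;
}

/* Self-check: both registers end up holding A0, and A1 is lost. */
static void check_naive_getargs(void)
{
    regs_t r = naive_getargs(0xA, 0xB);
    assert(r.rsi == 0xA);
    assert(r.rdi == 0xA);
}
```

Whatever value arrives in RSI never reaches V2, because it is overwritten before it is read.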
Unfortunately, I don't know the 64-bit port of lightning well enough to fix it directly. It seems that reusing RSI and RDI, the registers holding Arg1 and Arg0, as lightning GPRs may be a bad idea. Maybe using two registers from the range R8..R11 (the caller-saved ones) would be better? [I do not know whether lightning uses all the available registers or not.] As things stand, a move from an input register to anything that lightning supports (i.e., R0..R2 or V0..V2) is going to be incorrect when the target is also an input register (i.e., R1, R2, V1, V2).
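One possible shape of a fix, sketched on a toy register model in plain C rather than in lightning itself: park the second argument in a scratch register before RSI is overwritten. The choice of R10 as the scratch register is an assumption (I have not checked that lightning leaves it free), and the names scratch_regs_t, getargs_via_scratch and check_getargs_via_scratch are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: the two argument registers plus one scratch. */
typedef struct {
    uint64_t rdi; /* holds A0 on entry, is V2 on exit */
    uint64_t rsi; /* holds A1 on entry, is V1 on exit */
    uint64_t r10; /* assumed-free caller-saved scratch register */
} scratch_regs_t;

/* Same two getargs, but A1 is saved before its register is reused. */
static scratch_regs_t getargs_via_scratch(uint64_t a0, uint64_t a1)
{
    scratch_regs_t r = { a0, a1, 0 };
    r.r10 = r.rsi; /* mov %rsi,%r10 : park A1 */
    r.rsi = r.rdi; /* mov %rdi,%rsi : V1 <- A0 */
    r.rdi = r.r10; /* mov %r10,%rdi : V2 <- A1 */
    return r;
}

/* Self-check: V1 gets A0 and V2 gets A1, as the source code intended. */
static void check_getargs_via_scratch(void)
{
    scratch_regs_t r = getargs_via_scratch(0xA, 0xB);
    assert(r.rsi == 0xA);
    assert(r.rdi == 0xB);
}
```

This costs one extra move. Mapping V1 and V2 to registers that never carry incoming arguments would avoid the conflict without the extra instruction.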
What are your thoughts?