[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH 01/14] target/ppc: remove getVSR()/putVSR() from
From: |
Mark Cave-Ayland |
Subject: |
Re: [Qemu-devel] [PATCH 01/14] target/ppc: remove getVSR()/putVSR() from fpu_helper.c |
Date: |
Sun, 5 May 2019 10:27:02 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1 |
On 30/04/2019 17:25, Richard Henderson wrote:
> On 4/28/19 7:38 AM, Mark Cave-Ayland wrote:
>> void helper_xsaddqp(CPUPPCState *env, uint32_t opcode)
>> {
>> - ppc_vsr_t xt, xa, xb;
>> + ppc_vsr_t *xt = &env->vsr[rD(opcode) + 32];
>> + ppc_vsr_t *xa = &env->vsr[rA(opcode) + 32];
>> + ppc_vsr_t *xb = &env->vsr[rB(opcode) + 32];
>> float_status tstat;
>>
>> - getVSR(rA(opcode) + 32, &xa, env);
>> - getVSR(rB(opcode) + 32, &xb, env);
>> - getVSR(rD(opcode) + 32, &xt, env);
>> helper_reset_fpstatus(env);
>>
>> tstat = env->fp_status;
>> @@ -1860,18 +1857,17 @@ void helper_xsaddqp(CPUPPCState *env, uint32_t
>> opcode)
>> }
>>
>> set_float_exception_flags(0, &tstat);
>> - xt.f128 = float128_add(xa.f128, xb.f128, &tstat);
>> + xt->f128 = float128_add(xa->f128, xb->f128, &tstat);
>> env->fp_status.float_exception_flags |= tstat.float_exception_flags;
>>
>> if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {
>> float_invalid_op_addsub(env, 1, GETPC(),
>> - float128_classify(xa.f128) |
>> - float128_classify(xb.f128));
>> + float128_classify(xa->f128) |
>> + float128_classify(xb->f128));
>
> These values are no longer valid, because you may have written over them with
> the store to xt->f128. You need to keep the result in a local variable until
> the location of the putVSR in order to keep the current semantics.
>
> (Although the current semantics probably need to be reviewed with respect to
> how the exception is signaled vs the result is stored to the register file. I
> know there are current bugs in this area with respect to regular
> floating-point
> operations, never mind the vector floating-point ones.)
>
>> #define VSX_ADD_SUB(name, op, nels, tp, fld, sfprf, r2sp)
>> \
>> void helper_##name(CPUPPCState *env, uint32_t opcode)
>> \
>> {
>> \
>> - ppc_vsr_t xt, xa, xb;
>> \
>> + ppc_vsr_t *xt = &env->vsr[xT(opcode)];
>> \
>> + ppc_vsr_t *xa = &env->vsr[xA(opcode)];
>> \
>> + ppc_vsr_t *xb = &env->vsr[xB(opcode)];
>> \
>> int i;
>> \
>>
>> \
>> - getVSR(xA(opcode), &xa, env);
>> \
>> - getVSR(xB(opcode), &xb, env);
>> \
>> - getVSR(xT(opcode), &xt, env);
>> \
>> helper_reset_fpstatus(env);
>> \
>>
>> \
>> for (i = 0; i < nels; i++) {
>> \
>> float_status tstat = env->fp_status;
>> \
>> set_float_exception_flags(0, &tstat);
>> \
>> - xt.fld = tp##_##op(xa.fld, xb.fld, &tstat);
>> \
>> + xt->fld = tp##_##op(xa->fld, xb->fld, &tstat);
>> \
>> env->fp_status.float_exception_flags |=
>> tstat.float_exception_flags; \
>>
>> \
>> if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {
>> \
>> float_invalid_op_addsub(env, sfprf, GETPC(),
>> \
>> - tp##_classify(xa.fld) |
>> \
>> - tp##_classify(xb.fld));
>> \
>> + tp##_classify(xa->fld) |
>> \
>> + tp##_classify(xb->fld));
>> \
>> }
>> \
>
> Similarly. Only here it's more interesting in that element 0 is modified when
> element 3 raises an exception. To keep current semantics you need to keep xt
> as a ppc_vsr_t local variable and write back at the end.
>
> It looks like the same is true for every other function.
Meh, so I forgot about the case where src == dest and obviously it's not
something
that gets tickled by my test images :(
I've spent a bit of time today going through the functions and it seems that all
functions which have an xt parameter, minus a couple of the TEST macros,
require the
result to be calculated in a local variable first.
I think the best solution is still to remove getVSR()/putVSR() but replace them
with
macros for copying and zeroing like this:
#define VSRCPY(d, s) (memcpy(d, s, sizeof(ppc_vsr_t)))
#define VSRZERO(d) (memset(d, 0, sizeof(ppc_vsr_t)))
Even though we're still doing a copy of the result register I hope that not
having to
copy the 2 source registers, plus replacing the copies with a straight memcpy()
will
still be an advantage. Does that seem sensible to you?
FWIW I agree that the exception handling seems inconsistent around the
functions: for
example it looks strange that the VSX_FMADD code blindly copies the result back
even
when an exception occurred.
ATB,
Mark.
- Re: [Qemu-devel] [PATCH 01/14] target/ppc: remove getVSR()/putVSR() from fpu_helper.c,
Mark Cave-Ayland <=