[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 00/34] tcg, target/ppc vector improvements

From: Richard Henderson
Subject: Re: [Qemu-devel] [PATCH 00/34] tcg, target/ppc vector improvements
Date: Tue, 18 Dec 2018 08:16:49 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.3.1

On 12/18/18 7:26 AM, Mark Cave-Ayland wrote:
> That seems wrong to me. Given that the ppc_avr_t is a union then I'd expect 
> it to be
> in host order? Certainly in the VMX helper macros I've looked at, the members 
> are set
> directly with no byte swapping.

"Host order"?  For both words of the vector?

That's certainly going to cause problems wrt VSX and FPU registers.  We're
hard-coding that as fpu == vsx.u64[0] (both before and after your patch set).

For vscr, on master we have

void helper_mtvscr(CPUPPCState *env, ppc_avr_t *r)
    env->vscr = r->u32[3];
    env->vscr = r->u32[0];


        if (needs_byteswap) {
            vmxregset->avr[i].u64[0] = bswap64(cpu->env.avr[i].u64[1]);
            vmxregset->avr[i].u64[1] = bswap64(cpu->env.avr[i].u64[0]);
        } else {
            vmxregset->avr[i].u64[0] = cpu->env.avr[i].u64[0];
            vmxregset->avr[i].u64[1] = cpu->env.avr[i].u64[1];
    vmxregset->vscr.u32[3] = cpu_to_dump32(s, cpu->env.vscr);

For helper macros that apply the same operation to all lanes, it doesn't matter
which order in which the lanes are processed, so of course I would expect them
to be processed in host order.

It's cases that do not apply the same operation, such as merges, where the
problems would arise.

There are at least 3 schemes being employed to address this:

#define HI_IDX 0
#define LO_IDX 1
#define AVRB(i) u8[i]
#define AVRW(i) u32[i]
#define HI_IDX 1
#define LO_IDX 0
#define AVRB(i) u8[15-(i)]
#define AVRW(i) u32[3-(i)]

#define EL_IDX(i) (i)
#define EL_IDX(i) (3 - (i))

#define EL_IDX(i) (i)
#define EL_IDX(i) (1 - (i))

        result.u8[i] = a->u8[indexA] ^ b->u8[indexB];
        result.u8[i] = a->u8[15-indexA] ^ b->u8[15-indexB];


reply via email to

[Prev in Thread] Current Thread [Next in Thread]