qemu-devel


From: Richard Henderson
Subject: Re: [PATCH 3/4] target/arm: Fixup SIMD fcmla(by element) in 4H arrangement
Date: Tue, 8 Dec 2020 15:04:16 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0

On 12/6/20 10:46 PM, LIU Zhiwei wrote:
> For SIMD fcmla(by element), if the number of elements is less than
> the number of elements within one segment,i.e. 4H arrangement,
> we should not calculate the entire segment.
> 
> Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
> ---
>  target/arm/vec_helper.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
> index 7174030377..44b8165323 100644
> --- a/target/arm/vec_helper.c
> +++ b/target/arm/vec_helper.c
> @@ -544,6 +544,10 @@ void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm,
>      neg_real <<= 15;
>      neg_imag <<= 15;
>  
> +    /* Adjust eltspersegment for simd 4H */
> +    if (eltspersegment > elements) {
> +        eltspersegment = elements;
> +    }

Ok.  Maybe better to fold this back into the initialization of
eltspersegment using MIN.
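
A minimal sketch of what folding the clamp into the initialization could
look like. The helper name elts_per_segment and its parameters are
hypothetical and exist only for illustration; the local MIN definition
stands in for the macro QEMU already provides. It assumes a 16-byte
segment and an operation size given in bytes, as in the quoted helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for QEMU's MIN(), so the sketch compiles on its own. */
#ifndef MIN
#define MIN(a, b) ((a) < (b) ? (a) : (b))
#endif

/*
 * Hypothetical helper (not part of the patch): number of elements handled
 * per loop step, clamped at initialization instead of in a later "if".
 * opr_sz is the operation size in bytes, elt_size the element size in bytes.
 */
static intptr_t elts_per_segment(intptr_t opr_sz, size_t elt_size)
{
    assert(elt_size > 0);
    intptr_t elements = opr_sz / (intptr_t)elt_size;
    /* A segment is 16 bytes; never step past the real element count. */
    intptr_t per_segment = 16 / (intptr_t)elt_size;
    return MIN(per_segment, elements);
}
```

With 2-byte elements, an 8-byte operation (4H) yields 4 and a 16-byte
operation (8H) yields 8, so the loop stride stays within the vector
in both arrangements.
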

>      for (i = 0; i < elements; i += eltspersegment) {
>          float16 mr = m[H2(i + 2 * index + 0)];
>          float16 mi = m[H2(i + 2 * index + 1)];
> @@ -610,6 +614,10 @@ void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm,
>      neg_real <<= 31;
>      neg_imag <<= 31;
>  
> +    /* Adjust eltspersegment for simd 4H */
> +    if (eltspersegment > elements) {
> +        eltspersegment = elements;
> +    }

Incorrect: this function only computes 4S, so eltspersegment never
exceeds elements here and no adjustment is needed.

>      for (i = 0; i < elements; i += eltspersegment) {
>          float32 mr = m[H4(i + 2 * index + 0)];
>          float32 mi = m[H4(i + 2 * index + 1)];
> 


r~


