Re: R: [RFC PATCH v2] target/ppc: Enable hardfloat for PPC

From: BALATON Zoltan
Subject: Re: R: [RFC PATCH v2] target/ppc: Enable hardfloat for PPC
Date: Wed, 26 Feb 2020 23:51:05 +0100 (CET)
User-agent: Alpine 2.22 (BSF 395 2020-01-19)

On Wed, 26 Feb 2020, Alex Bennée wrote:
That's the wrong way around. We have regression tests for a reason. I'll
happily accept patches to turn on hardfloat for PPC if:

a) they don't cause regressions in our fairly extensive floating point

Where are these tests and how do I run them? I'm not aware of such tests, so I've only tried running simple guest code to test changes, but if there are more extensive FP tests I'd better use those.

b) the PPC maintainers are happy with the new performance profile

The way forward would be to:

1. patch to drop #if defined(TARGET_PPC) || defined(__FAST_MATH__)

This is simple, but I've found that while it seems to make some vector instructions faster, it also makes most simple FP ops slower: the code goes through checking whether it can use hardfloat but never can, because fp_status is cleared before every FP op. That's why I set the inexact bit, to let it use hardfloat and to be able to test whether it would work at all. That's all my RFC patch did. I had a second version that tried to avoid the slowdown by dropping the above #if defined() but setting hardfloat=false, so that it only uses softfloat as before, but it did not work out too well; some tests said v2 was even slower. Maybe to avoid the overhead we should add a variable that can be set at runtime instead of the QEMU_NO_HARDFLOAT define, but that probably won't be faster either. So it seems there's no way to enable hardfloat for PPC without slowing down most FP ops unless we also do some of the other points below (even if it's beneficial for vector ops).

2. audit target/ppc/fpu_helper.c w.r.t chip manual and fix any unneeded
splatting of flags (if any)

This would need either someone who knows the PPC FPU or someone who can take the time to learn it and go through the code. Not sure I want to volunteer for that. But I think the flags are cleared mainly to emulate the FI bit, which is a non-sticky inexact bit that should show the inexact status of the last FP op. (There's a similar bit for fraction rounded as well, but that one does not disable hardfloat.) The question is whether we really want to emulate these bits accurately. Is there any software we care about that relies on them? If we can live without correct FI bit emulation (which was the case for a long time until these were fixed), then we have an easy way to enable hardfloat without more extensive changes. If we want to emulate these bits accurately as well, then we will probably need changes to softfloat to allow registering FP exception handlers, so that instead of clearing and checking bits we get an exception from the FPU and set those bits then, but I have no idea how to do that.

