Re: [RFC PATCH v2] target/ppc: Enable hardfloat for PPC

From: Alex Bennée
Subject: Re: [RFC PATCH v2] target/ppc: Enable hardfloat for PPC
Date: Wed, 26 Feb 2020 12:28:29 +0000
User-agent: mu4e 1.3.8; emacs 27.0.60

BALATON Zoltan <address@hidden> writes:

> On Fri, 21 Feb 2020, Peter Maydell wrote:
>> On Fri, 21 Feb 2020 at 18:04, BALATON Zoltan <address@hidden> wrote:
>>> On Fri, 21 Feb 2020, Peter Maydell wrote:
>>>> I think that is the wrong approach. Enabling use of the host
>>>> FPU should not affect the accuracy of the emulation, which
>>>> should remain bitwise-correct. We should only be using the
>>>> host FPU to the extent that we can do that without discarding
>>>> accuracy. As far as I'm aware that's how the hardfloat support
>>>> for other guest CPUs that use it works.

Correct - we only use hardfloat when we know it will give the same
result, which is broadly when the inexact flag is already set and we are
dealing with normal floating point numbers.
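
To make that condition concrete, here is a minimal sketch of such a
guard. It is not the actual QEMU code: GuestFPStatus, soft_add() and
guest_add() are hypothetical names, the rounding-mode check and the
"result must be normal" check are extra conservative assumptions, and
the software path is only a stand-in for a real bit-exact
implementation.

```c
#include <float.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical guest FP state, for illustration only. */
typedef struct {
    bool inexact_already_set;  /* guest's sticky inexact flag is set */
    bool round_nearest_even;   /* guest rounding mode matches the host default */
} GuestFPStatus;

/* True for the "easy" operands: zero or a normal number. */
static bool is_zero_or_normal(double x)
{
    int c = fpclassify(x);
    return c == FP_ZERO || c == FP_NORMAL;
}

/*
 * Stand-in for the bit-exact software path (hypothetical). A real one
 * would compute the result and update the guest flags in software.
 */
static double soft_add(double a, double b, GuestFPStatus *s, bool *used_soft)
{
    (void)s;
    *used_soft = true;
    return a + b;
}

/*
 * Use the host FPU only when it cannot change what the guest observes:
 * inexact is already set (so it need not be detected again), and both
 * operands are zero or normal. Anything else goes to the software path.
 */
static double guest_add(double a, double b, GuestFPStatus *s, bool *used_soft)
{
    *used_soft = false;
    if (s->inexact_already_set && s->round_nearest_even &&
        is_zero_or_normal(a) && is_zero_or_normal(b)) {
        double r = a + b;  /* host FPU */
        if (fpclassify(r) == FP_NORMAL) {
            return r;
        }
        /* Zero, tiny, infinite or NaN result: redo it in software. */
    }
    return soft_add(a, b, s, used_soft);
}
```

With the inexact flag set, guest_add(1.5, 2.25, ...) stays on the host
FPU, while a subnormal operand such as DBL_MIN / 2 or a clear inexact
flag sends the operation to the software path.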

When the code went in we added a whole bunch of tests to ensure we
maintain accuracy.

>> I don't know much about PPC, but if you can't emulate the
>> guest architecture accurately with the host FPU, then
>> don't use the host FPU.
> I don't know if it's possible or not to emulate these accurately and
> use the FPU but nobody did it for QEMU so far. But if someone knows a
> way please speak up then we can try to implement it. Unfortunately
> this would require more detailed knowledge about different FPU
> implementations (at least X86_64, ARM and PPC that are the most used
> platforms) than what I have or am willing to spend time to learn.
>> We used to have a kind of 'hardfloat'
>> support that was fast but inaccurate, but it was a mess
>> because it meant that most guest code sort of worked but
>> some guest code would confusingly misbehave. Deliberately
>> not correctly emulating the guest CPU/FPU behaviour is not
>> something I want us to return to.
>> You're right that sometimes you can't get both speed
>> and accuracy; other emulators (and especially ones
>> which are trying to emulate games consoles) may choose
>> to prefer speed over accuracy. For QEMU we prefer to
>> choose accuracy over speed in this area.
> OK, then how about keeping the default accurate but allowing users to
> opt in to using the FPU even if it's known to break some bits, for
> workloads where users need speed over accuracy and would be happy to
> live with the limitation.

About the only comparison I can think of is the thread=single/multi
flag for TCG, which is mostly there to help developers eliminate causes
of bugs. MTTCG is enabled by default only when it is safe. If you
enable it via the command line where QEMU hasn't defaulted it on you
will get lots of loud warnings about potential instability. The most
commonly used case is thread=single when you want to check it's not an
MTTCG bug.

I'm as cautious as Peter here about adding a "faster but broken" command
line flag because users will invariably read up to the "faster" and then
spend a lot of time scratching their heads when things break.

> Note that I've found that just removing the define
> that disables hardfloat for PPC target makes VMX vector instructions
> faster while normal FPU is a little slower without any other changes
> so disabling hardfloat already limits performance for guests using VMX
> even when not using the FPU for cases when it would cause inaccuracy.
> If you say we want accuracy and don't care about speed, then just
> don't disable hardfloat as it helps at least VMX and then we can add
> option to allow the user to say we can use hardfloat even if it's
> inaccurate then they can test their workload and decide for
> themselves.
> Regards,
> BALATON Zoltan

Alex Bennée
