qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: tb_flush() calls causing long Windows XP boot times


From: Alex Bennée
Subject: Re: tb_flush() calls causing long Windows XP boot times
Date: Fri, 11 Jun 2021 12:24:49 +0100
User-agent: mu4e 1.5.13; emacs 28.0.50

Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> writes:

> On 10/06/2021 14:14, Peter Maydell wrote:
>
>> On Thu, 10 Jun 2021 at 14:02, Programmingkid <programmingkidx@gmail.com> 
>> wrote:
>>>
>>> Hi Richard,
>>>
>>> There is a function called breakpoint_invalidate() in cpu.c that
>>> calls a function called tb_flush(). I have determined that this
>>> call is being made over 200,000 times when Windows XP boots.
>>> Disabling this function makes Windows XP boot way faster than
>>> before. The time went down from around 3 minutes to 20 seconds when
>>> I applied the patch below.
>>>
>>> After I applied the patch I ran several tests in my VM's to see if anything 
>>> broke. I could not find any problems. Here is the list my VM's I tested:
>>>
>>> Mac OS 10.8 in qemu-system-x86_64
>>> Windows 7 in qemu-system-x86_64
>>> Windows XP in qemu-system-i386
>>> Mac OS 10.4 in qemu-system-ppc
>>>
>>> I would be happy if the patch below was accepted but I would like to know 
>>> your thoughts.
>> 
>>>   cpu.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/cpu.c b/cpu.c
>>> index bfbe5a66f9..297c2e4281 100644
>>> --- a/cpu.c
>>> +++ b/cpu.c
>>> @@ -253,7 +253,7 @@ static void breakpoint_invalidate(CPUState *cpu, 
>>> target_ulong pc)
>>>        * Flush the whole TB cache to force re-translation of such TBs.
>>>        * This is heavyweight, but we're debugging anyway.
>>>        */
>>> -    tb_flush(cpu);
>>> +    /* tb_flush(cpu); */
>>>   }
>>>   #endif
>> The patch is clearly wrong -- this function is called when a CPU
>> breakpoint
>> is added or removed, and we *must* drop generated code which either
>> (a) includes code to take the breakpoint exception and now should not
>> or (b) doesn't include code to take the breakpoint exception and now should.
>> Otherwise we will incorrectly take/not take a breakpoint exception when
>> that stale code is executed.
>> As the comment notes, the assumption is that we won't be adding and
>> removing breakpoints except when we're debugging and therefore
>> performance is not critical. Windows XP is clearly doing something
>> we weren't expecting, so we should ideally have a look at whether
>> we can be a bit more efficient about not throwing the whole TB
>> cache away.
>
> FWIW this was reported a while back on Launchpad as
> https://bugs.launchpad.net/qemu/+bug/1883593 where the performance
> regression was traced back to:
>
> commit b55f54bc965607c45b5010a107a792ba333ba654
> Author: Max Filippov <jcmvbkbc@gmail.com>
> Date:   Wed Nov 27 14:06:01 2019 -0800
>
>     exec: flush CPU TB cache in breakpoint_invalidate
>
>     When a breakpoint is inserted at location for which there's currently no
>     virtual to physical translation no action is taken on CPU TB cache. If a
>     TB for that virtual address already exists but is not visible ATM the
>     breakpoint won't be hit next time an instruction at that address will be
>     executed.
>
>     Flush entire CPU TB cache in breakpoint_invalidate to force
>     re-translation of all TBs for the breakpoint address.
>
>     This change fixes the following scenario:
>     - linux user application is running
>     - a breakpoint is inserted from QEMU gdbstub for a user address that is
>       not currently present in the target CPU TLB
>     - an instruction at that address is executed, but the external debugger
>       doesn't get control.
>
>     Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
>     Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
>     Message-Id: <20191127220602.10827-2-jcmvbkbc@gmail.com>
>     Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

This was a reversion of a reversion (see 406bc339b0 and a9353fe89). So
we've bounced between solutions several times. Fundamentally if it gets
tricky to ensure your translated state matches the actual machine state
the easiest option is to throw everything away and start again.

> Perhaps Windows XP is constantly executing some kind of breakpoint
> instruction or updating some CPU debug registers during boot?

It would be useful to identify what exactly is triggering the changes
here. If it's old fashioned breakpoint instructions being inserted into
memory we will need to ensure our invalidating of old translations is
rock solid. If we are updating our internal breakpoint/watchpoint
tracking as a result of the guest messing with debug registers maybe we
can do something a bit more finessed?

>
>
> ATB,
>
> Mark.


-- 
Alex Bennée



reply via email to

[Prev in Thread] Current Thread [Next in Thread]