

Re: [Qemu-devel] [Qemu-arm] [patch 1/1]about armv8's prefetch decode

From: Wangjintang
Subject: Re: [Qemu-devel] [Qemu-arm] [patch 1/1]about armv8's prefetch decode
Date: Sat, 25 Mar 2017 02:22:11 +0000

Hi Peter,
        More details are given below.

> -----Original Message-----
> From: Peter Maydell [mailto:address@hidden
> Sent: Friday, March 24, 2017 6:06 PM
> To: Wangjintang
> Cc: Pranith Kumar; Shlomo Pongratz (A); Wanghaibin (Benjamin); qemu-arm;
> qemu-devel; Ori Chalak (A)
> Subject: Re: [Qemu-arm] [patch 1/1]about armv8's prefetch decode
> No, these changes look wrong. PRFM instructions do not need to
> do anything and should definitely not be emitting any intermediate
> code. In particular if you let execution fall through and try
> do_gpr_ld() then it will really do a load, which might cause
> an exception -- this is specifically forbidden for PRFM.
> Architecturally the ARM ARM says "it is valid for the PE to
> treat any or all prefetch instructions as a NOP", which is
> what QEMU does.
> The existing code is correct. In general you should not
> expect to be able to deduce the guest instructions from
> the intermediate code representation.

"it is valid for the PE to treat any or all prefetch instructions as a NOP":
from the software's point of view, that is right.
The patch treats the prefetch as a load instruction but, at the same time,
does not modify the rm/rt registers. Only when the PRFM instruction is emitted
into the intermediate code and performs a real load can we get the memory
address associated with the prefetch instruction. Because the rm/rt registers
are not modified, the application still runs correctly.
BTW, the newly added code is disabled by default, so it has no effect on
ordinary users.
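
To illustrate the intent (this is not the patch itself and not QEMU's TCG
code), here is a minimal sketch in plain C. All names (ToyCpu, toy_prfm_reg,
trace_prefetch) are hypothetical. Unlike the patch, which goes through a real
load, the sketch only computes and records the effective address of a
register-offset PRFM; it shows the two properties described above: the
registers are never written, and the tracing is off by default. It also
simplifies A64 by ignoring the SP/XZR encoding of register 31.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TRACE_CAP 16

/* Hypothetical trace record: one entry per traced memory access. */
typedef enum { ACC_LOAD, ACC_STORE, ACC_PREFETCH } AccessKind;

typedef struct {
    AccessKind kind;
    uint64_t   vaddr;   /* guest virtual address of the access */
} TraceEntry;

/* Toy CPU state, an assumption for illustration only. */
typedef struct {
    uint64_t   x[31];           /* general-purpose registers X0..X30 */
    bool       trace_prefetch;  /* prefetch tracing, disabled by default */
    TraceEntry trace[TRACE_CAP];
    size_t     trace_len;
} ToyCpu;

static void toy_cpu_init(ToyCpu *cpu)
{
    *cpu = (ToyCpu){0};         /* zero registers, tracing off, empty trace */
}

/*
 * Model of "PRFM <prfop>, [Xn, Xm, LSL #shift]".
 * With tracing off it is a pure NOP. With tracing on it records the
 * effective address. In both cases no register is written and no
 * memory is read.
 */
static void toy_prfm_reg(ToyCpu *cpu, unsigned rn, unsigned rm, unsigned shift)
{
    assert(rn < 31 && rm < 31 && shift < 64);

    if (!cpu->trace_prefetch) {
        return;                 /* default behaviour: treat PRFM as a NOP */
    }

    uint64_t addr = cpu->x[rn] + (cpu->x[rm] << shift);

    assert(cpu->trace_len < TRACE_CAP);
    cpu->trace[cpu->trace_len++] = (TraceEntry){ ACC_PREFETCH, addr };
}
```

With X1 = 0x1000 and X2 = 4, calling toy_prfm_reg(&cpu, 1, 2, 3) records
nothing while trace_prefetch is false; once it is set to true, the same call
appends one ACC_PREFETCH entry for the address X1 + (X2 << 3) and leaves X1
and X2 unchanged.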

In our case, we need the full instruction trace plus the memory addresses
accessed by ld/st instructions; the trace is the input to a chip
cycle-accurate model, similar to Flexus + QEMU.
The current code skips generating intermediate code for prefetch instructions,
so we can see the prefetch instruction itself but cannot get the memory
address it refers to.
We have measured that the ratio of prefetch instructions is about 2%~3% when
running Dhrystone in system mode, which is high.
 ________________                        ________________
|                |                      |                |
|                |                      |                |
|      Qemu      |                      |      chip      |
|                |  instruction trace   | cycle-accurate |
|                | -------------------> |     model      |
|                |     memory trace     |                |
|________________|                      |________________|

Ori Chalak explains this as follows:
"Indeed, prefetch instruction affects only the micro architecture, 
and hence not needed for running correctly the generated code.
However, we developed a performance simulator for a detailed 
ARMv8 CPU model, and use Qemu to resolve the functionality.
And for this purpose we need to translate all instructions that 
may affect the pipeline behavior, caches, etc.

This is not the major usage of Qemu, however there may be 
others doing this and it may help them."

Best Regards,
Wang jintang / Jed
Huawei Technologies Co., Ltd. 
Email: address@hidden

