From: Richard Henderson
Subject: Re: [Qemu-riscv] [Qemu-devel] [PATCH] RISCV: support riscv vector extension 0.7.1
Date: Tue, 3 Sep 2019 07:38:13 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0

On 9/2/19 12:45 AM, liuzhiwei wrote:
> On 2019/8/29 11:09 PM, Richard Henderson wrote:
>> On 8/29/19 5:45 AM, liuzhiwei wrote:
>>> Even in qemu, there may be situations where VSTART != 0. For example, a load
>>> instruction leads to a page fault exception in a middle position. If VSTART ==
>>> 0, some elements that had been loaded before the exception will be loaded once
>>> again.
>> Alternately, you can validate all of the pages before performing any memory
>> operations.  At which point there will never be an exception in the middle.
> As a vector instruction may access memory across many pages, is there any way
> to validate the pages? Page table walk? Or some TLB APIs?

Yes, there are TLB APIs.  Several of them, depending on what is needed.

> #0  cpu_watchpoint_address_matches (wp=0x555556228110, addr=536871072, len=1)
> at qemu/exec.c:1094
> #1  0x000055555567204f in check_watchpoint (offset=160, len=1, attrs=...,
> flags=2) at qemu/exec.c:2803
> #2  0x0000555555672379 in watch_mem_write (opaque=0x0, addr=536871072,
> val=165, size=1, attrs=...) at qemu/exec.c:2878
> #3  0x00005555556d44bb in memory_region_write_with_attrs_accessor
> (mr=0x5555561292e0 <io_mem_watch>, addr=536871072, value=0x7fffedffe2c8,
> size=1, shift=0, mask=255, attrs=...)
>     at qemu/memory.c:553
> #4  0x00005555556d45de in access_with_adjusted_size (addr=536871072,
> value=0x7fffedffe2c8, size=1, access_size_min=1, access_size_max=8,
> access_fn=0x5555556d43cd <memory_region_write_with_attrs_accessor>,
>     mr=0x5555561292e0 <io_mem_watch>, attrs=...) at qemu/memory.c:594
> #5  0x00005555556d7247 in memory_region_dispatch_write (mr=0x5555561292e0
> <io_mem_watch>, addr=536871072, data=165, size=1, attrs=...) at 
> qemu/memory.c:1480
> #6  0x00005555556f0d13 in io_writex (env=0x5555561efb58,
> iotlbentry=0x5555561f5398, mmu_idx=1, val=165, addr=536871072, retaddr=0,
> recheck=false, size=1) at qemu/accel/tcg/cputlb.c:909
> #7  0x00005555556f19a6 in io_writeb (env=0x5555561efb58, mmu_idx=1, index=0,
> val=165 '\245', addr=536871072, retaddr=0, recheck=false) at
> qemu/accel/tcg/softmmu_template.h:268
> #8  0x00005555556f1b54 in helper_ret_stb_mmu (env=0x5555561efb58,
> addr=536871072, val=165 '\245', oi=1, retaddr=0) at
> qemu/accel/tcg/softmmu_template.h:304
> #9  0x0000555555769f06 in cpu_stb_data_ra (env=0x5555561efb58, ptr=536871072,
> v=165, retaddr=0) at qemu/include/exec/cpu_ldst_template.h:182
> #10 0x0000555555769f80 in cpu_stb_data (env=0x5555561efb58, ptr=536871072,
> v=165) at /qemu/include/exec/cpu_ldst_template.h:194
> #11 0x000055555576a913 in csky_cpu_stb_data (env=0x5555561efb58,
> vaddr=536871072, data=165 '\245') at qemu/target/csky/csky_ldst.c:48
> #12 0x000055555580ba7d in helper_vdsp2_vstru_n (env=0x5555561efb58,
> insn=4167183360) at qemu/target/csky/op_vdsp2.c:1317
> The path is not related to probe_write in the patch().

Of course.  It wasn't supposed to be.

> Could you give more details or a test case where watchpoint doesn't work
> correctly?

It goes wrong if the store partially, but not completely, overlaps the
watchpoint.  This is obviously much easier to do with large vector operations
than with normal integer operations.

In this case, we may have completed some of the stores before encountering the
watchpoint, at which point check_watchpoint() will longjmp back to the cpu
main loop.  Now we have a problem: the store is partially complete and it
should not be.

Therefore, we now have patches queued in tcg-next that adjust probe_write to
perform both access and watchpoint tests.  There is still target-specific code
that must be adjusted to match, so there are not currently any examples in the
tree to show.

However, the idea is:
  (1) Instructions that perform more than one host store must probe
      the entire range to be stored before performing any stores.

  (2) Instructions that perform more than one host load must either
      probe the entire range to be loaded, or collect the data in
      temporary storage.  If not using probes, writeback to the
      register file must be delayed until after all loads are done.

  (3) Any one probe may not cross a page boundary; splitting of the
      access across pages must be done by the helper.
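
To make the three rules concrete, here is a minimal sketch in plain C against
a toy memory model.  Nothing in it is a QEMU API: ToyMem, probe_page,
probe_range, vec_store and vec_load are hypothetical names, the 16-byte page
size is chosen only to keep the example small, and a failed probe returns
false where a real helper would raise the guest exception instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Toy guest memory; all names below are hypothetical, not QEMU APIs. */
#define PAGE_SIZE  16u
#define NUM_PAGES  4u
#define MEM_SIZE   (PAGE_SIZE * NUM_PAGES)
#define VLEN_BYTES 32u            /* assumed size of one vector register */

typedef struct {
    uint8_t data[MEM_SIZE];
    bool readable[NUM_PAGES];
    bool writable[NUM_PAGES];
} ToyMem;

/* Rule 3: a single probe never crosses a page boundary. */
static bool probe_page(const ToyMem *m, uint32_t addr, uint32_t len,
                       bool is_write)
{
    assert(len > 0 && (addr % PAGE_SIZE) + len <= PAGE_SIZE);
    if (addr >= MEM_SIZE) {
        return false;             /* unmapped address */
    }
    uint32_t page = addr / PAGE_SIZE;
    return is_write ? m->writable[page] : m->readable[page];
}

/* Rule 3, continued: the helper splits the range into per-page probes. */
static bool probe_range(const ToyMem *m, uint32_t addr, uint32_t len,
                        bool is_write)
{
    while (len > 0) {
        uint32_t in_page = PAGE_SIZE - (addr % PAGE_SIZE);
        uint32_t chunk = len < in_page ? len : in_page;
        if (!probe_page(m, addr, chunk, is_write)) {
            return false;
        }
        addr += chunk;
        len -= chunk;
    }
    return true;
}

/* Rule 1: probe the entire range before performing any store. */
static bool vec_store(ToyMem *m, uint32_t addr, const uint8_t *src,
                      uint32_t len)
{
    if (!probe_range(m, addr, len, true)) {
        return false;             /* fault: guest memory is untouched */
    }
    memcpy(&m->data[addr], src, len);
    return true;
}

/* Rule 2, without an up-front probe: collect into temporary storage
 * and write back to the register only after every load succeeded. */
static bool vec_load(const ToyMem *m, uint32_t addr, uint8_t *vreg,
                     uint32_t len)
{
    uint8_t tmp[VLEN_BYTES];
    assert(len <= VLEN_BYTES);
    for (uint32_t i = 0; i < len; i++) {
        if (!probe_page(m, addr + i, 1, false)) {
            return false;         /* fault: vreg is untouched */
        }
        tmp[i] = m->data[addr + i];
    }
    memcpy(vreg, tmp, len);
    return true;
}
```

With page 1 marked read-only, an 8-byte vec_store at address 12 fails in the
probe and leaves bytes 12..15 unmodified, which is exactly the "no partially
complete store" property.  A watchpoint check would slot into probe_page in
the same way as the permission check.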

