qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] target/hppa: Optimize ldcw/ldcd instruction translation


From: Helge Deller
Subject: Re: [PATCH] target/hppa: Optimize ldcw/ldcd instruction translation
Date: Thu, 14 Sep 2023 23:19:44 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0

Hi Richard,

On 9/13/23 22:30, Richard Henderson wrote:
On 9/13/23 10:19, Helge Deller wrote:
On 9/13/23 18:55, Richard Henderson wrote:
On 9/13/23 07:47, Helge Deller wrote:
+        haddr = (uint32_t *)((uintptr_t)vaddr);
+        old = *haddr;

This is horribly incorrect, both for user-only and system mode.

Richard, thank you for the review!
But would you mind explaining why this is incorrect?
I thought the "vaddr = probe_access()" calculates the host address, so
shouldn't it be the right address?

The vaddr name is confusing (since it implies virtual address, which
the return from probe_access is not) as are the casts, which are
unnecessary.

Still, I think my code isn't as wrong as you said.

But I tend to agree with you on this:
Frankly, the current tcg_gen_atomic_xchg_reg is as optimized as
you'll be able to do.
tcg_gen_atomic_xchg_reg() seems to generate on x86-64:

0000000000525160 <helper_atomic_xchgl_be>:
  525160:       53                      push   %rbx
  525161:       4c 8b 44 24 08          mov    0x8(%rsp),%r8
  525166:       89 d3                   mov    %edx,%ebx
  525168:       89 ca                   mov    %ecx,%edx
  52516a:       b9 04 00 00 00          mov    $0x4,%ecx
  52516f:       e8 1c a6 ff ff          call   51f790 <atomic_mmu_lookup>
  525174:       48 89 c2                mov    %rax,%rdx
  525177:       89 d8                   mov    %ebx,%eax
  525179:       0f c8                   bswap  %eax
  52517b:       87 02                   xchg   %eax,(%rdx)
  52517d:       5b                      pop    %rbx
  52517e:       0f c8                   bswap  %eax
  525180:       c3                      ret

and atomic_mmu_lookup() is basically the same as probe_access(),
so there is probably no gain in my patch.

Please ignore my patch.

Thank you!
Helge



reply via email to

[Prev in Thread] Current Thread [Next in Thread]