Re: [Qemu-devel] [PATCH v5 07/18] qemu-thread: add simple test-and-set s

From: Sergey Fedorov
Subject: Re: [Qemu-devel] [PATCH v5 07/18] qemu-thread: add simple test-and-set spinlock
Date: Wed, 18 May 2016 18:59:34 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.7.2

On 18/05/16 18:44, Peter Maydell wrote:
> On 18 May 2016 at 16:36, Paolo Bonzini <address@hidden> wrote:
>> On 18/05/2016 17:35, Peter Maydell wrote:
>>>>> $ arm-linux-gnueabi-gcc -march=armv6 -O2 -c a.c
>>> I don't think armv6 is a sufficiently common host for us to
>>> worry too much about how its atomic primitives come out.
>>> ARMv7 and 64-bit ARMv8 are more relevant, I think.
>>> (v7 probably gets compiled the same way as v6 here, though.)
>> Well, v6 is raspberry pi isn't it?
> Yes, but v6 is also pretty slow anyhow, and if it wasn't
> for the outlier raspi case then v6 would be definitely
> irrelevant to everybody. Running QEMU on a slow ARM
> board is unlikely to be a great experience regardless.
> I'm not saying we should happily break v6, but I think
> we're better off making optimisation decisions looking
> forwards at v7 and v8 boards, rather than backwards at a
> single legacy v6 board.

Well, the ARMv7 code looks exactly the same, except that we have "dmb sy"
instead of "mcr 15, 0, r0, cr7, cr10, {5}".

Here is ARMv8 code for reference:

a.o:     file format elf64-littleaarch64

Disassembly of section .text:

    0000000000000000 <atomic_exchange>:
       0:    885ffc02     ldaxr    w2, [x0]
       4:    88037c01     stxr    w3, w1, [x0]
       8:    35ffffc3     cbnz    w3, 0 <atomic_exchange>
       c:    2a0203e0     mov    w0, w2
      10:    d65f03c0     ret

    0000000000000014 <atomic_compare_exchange>:
      14:    d10043ff     sub    sp, sp, #0x10
      18:    b9000fe1     str    w1, [sp,#12]
      1c:    885ffc03     ldaxr    w3, [x0]
      20:    6b01007f     cmp    w3, w1
      24:    54000061     b.ne    30 <atomic_compare_exchange+0x1c>
      28:    88047c02     stxr    w4, w2, [x0]
      2c:    6b1f009f     cmp    w4, wzr
      30:    1a9f17e0     cset    w0, eq
      34:    910043ff     add    sp, sp, #0x10
      38:    d65f03c0     ret

    000000000000003c <sync_val_compare_and_swap>:
      3c:    885ffc01     ldaxr    w1, [x0]
      40:    6b1f003f     cmp    w1, wzr
      44:    54000061     b.ne    50 <sync_val_compare_and_swap+0x14>
      48:    8803fc02     stlxr    w3, w2, [x0]
      4c:    35ffff83     cbnz    w3, 3c <sync_val_compare_and_swap>
      50:    6b1f003f     cmp    w1, wzr
      54:    1a9f07e0     cset    w0, ne
      58:    d65f03c0     ret

    000000000000005c <sync_lock_test_and_set>:
      5c:    885ffc02     ldaxr    w2, [x0]
      60:    88037c01     stxr    w3, w1, [x0]
      64:    35ffffc3     cbnz    w3, 5c <sync_lock_test_and_set>
      68:    d65f03c0     ret

And here is the x86-64 code as well (though I'm not good at reading x86 code):

a.o:     file format elf64-x86-64

Disassembly of section .text:

    0000000000000000 <atomic_exchange>:
       0:   89 f0                   mov    %esi,%eax
       2:   87 07                   xchg   %eax,(%rdi)
       4:   c3                      retq  
       5:   66 66 2e 0f 1f 84 00    data32 nopw %cs:0x0(%rax,%rax,1)
       c:   00 00 00 00

    0000000000000010 <atomic_compare_exchange>:
      10:   89 f0                   mov    %esi,%eax
      12:   89 74 24 fc             mov    %esi,-0x4(%rsp)
      16:   f0 0f b1 17             lock cmpxchg %edx,(%rdi)
      1a:   0f 94 c0                sete   %al
      1d:   c3                      retq  
      1e:   66 90                   xchg   %ax,%ax

    0000000000000020 <sync_val_compare_and_swap>:
      20:   31 c0                   xor    %eax,%eax
      22:   f0 0f b1 17             lock cmpxchg %edx,(%rdi)
      26:   85 c0                   test   %eax,%eax
      28:   0f 95 c0                setne  %al
      2b:   c3                      retq  
      2c:   0f 1f 40 00             nopl   0x0(%rax)

    0000000000000030 <sync_lock_test_and_set>:
      30:   87 37                   xchg   %esi,(%rdi)
      32:   c3                      retq  
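
The a.c source itself is not reproduced in this message. As a rough aid for reading the listings, here is a minimal sketch of a file that could plausibly compile to code of this shape. Everything in it is an assumption inferred from the disassembly rather than the actual test file: the signatures, the unused second parameter of sync_val_compare_and_swap, the weak compare-exchange, and the acquire memory orders. Only the function names and the GCC builtins they are named after come from the listings.

```c
/* Hypothetical reconstruction of a.c, inferred from the disassembly only. */
#include <stdbool.h>

/* Swap in v and return the old value (ldaxr/stxr loop, plain xchg on x86). */
int atomic_exchange(int *p, int v)
{
    return __atomic_exchange_n(p, v, __ATOMIC_ACQUIRE);
}

/* Weak CAS: assumed weak because the ARMv8 listing has no retry loop. */
bool atomic_compare_exchange(int *p, int expected, int desired)
{
    return __atomic_compare_exchange_n(p, &expected, desired, true,
                                       __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE);
}

/* Legacy __sync CAS against 0; returns true if *p was already non-zero.
 * The second parameter is assumed unused, since the new value arrives in
 * the third argument register in both listings. */
bool sync_val_compare_and_swap(int *p, int unused, int desired)
{
    (void)unused;
    return __sync_val_compare_and_swap(p, 0, desired) != 0;
}

/* Legacy __sync exchange with acquire semantics; old value discarded. */
void sync_lock_test_and_set(int *p, int v)
{
    __sync_lock_test_and_set(p, v);
}
```

Building this with the same flags as the command quoted above (-O2 -c) and running objdump -d on the resulting a.o should give something comparable, though the exact instructions will depend on the compiler version and target.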

Kind regards,

