Re: [Qemu-devel] [PATCH v6 01/50] tcg: Merge opcode arguments into TCGOp

From:	Emilio G. Cota
Subject:	Re: [Qemu-devel] [PATCH v6 01/50] tcg: Merge opcode arguments into TCGOp
Date:	Tue, 17 Oct 2017 16:04:51 -0400
User-agent:	Mutt/1.5.24 (2015-08-30)

On Mon, Oct 16, 2017 at 10:25:20 -0700, Richard Henderson wrote:
> From: Richard Henderson <address@hidden>
> 
> Rather than have a separate buffer of 10*max_ops entries,
> give each opcode 10 entries.  The result is actually a bit
> smaller and should have slightly more cache locality.
> 
> Signed-off-by: Richard Henderson <address@hidden>

Reviewed-by: Emilio G. Cota <address@hidden>
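
For readers skimming the thread, the layout change described in the quoted
commit message can be sketched roughly as below. This is only an
illustration: the type names, field names, MAX_ARGS and MAX_OPS are
hypothetical and heavily simplified, and are not QEMU's actual definitions.
The point is just that, once merged, an op's arguments sit right next to
its opcode instead of being reached through an index into a separate pool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified types for illustration; not QEMU's real ones. */
enum { MAX_ARGS = 10, MAX_OPS = 640 };  /* MAX_OPS is an arbitrary value */

typedef uintptr_t Arg;

/* Before: each op stores an index into a separate pool of
 * MAX_ARGS * MAX_OPS argument slots. */
typedef struct {
    uint16_t opc;
    uint16_t args;              /* index of this op's first arg in the pool */
} SplitOp;

typedef struct {
    SplitOp ops[MAX_OPS];
    Arg     arg_pool[MAX_ARGS * MAX_OPS];
} SplitBuf;

/* After: each op carries its own MAX_ARGS argument slots. */
typedef struct {
    uint16_t opc;
    Arg      args[MAX_ARGS];
} MergedOp;

typedef struct {
    MergedOp ops[MAX_OPS];
} MergedBuf;

/* Reaching argument i of an op needs two arrays in the split layout... */
static Arg *split_arg(SplitBuf *b, size_t op, size_t i)
{
    assert(op < MAX_OPS && i < MAX_ARGS);
    return &b->arg_pool[b->ops[op].args + i];
}

/* ...and only the op itself in the merged layout. */
static Arg *merged_arg(MergedBuf *b, size_t op, size_t i)
{
    assert(op < MAX_OPS && i < MAX_ARGS);
    return &b->ops[op].args[i];
}
```

In the merged form the argument lookup never leaves the op's own storage,
which is the kind of locality the commit message refers to.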

This gives a small yet measurable performance advantage when booting Linux:

 Performance counter stats for 'taskset -c 0 aarch64-softmmu/qemu-system-aarch64 \
        -M virt,gic_version=3 -cpu cortex-a57 -nographic -m 4096 -netdev \
        user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-device,netdev=unet \
        -drive file=jessie-arm64-die-on-boot.qcow2,id=myblock,index=0,if=none \
        -device virtio-blk-device,drive=myblock -kernel \
        aarch64-current-linux-kernel-only.img \
        -append console=ttyAMA0 root=/dev/vda1 -smp 1' (10 runs):

Before:
       7182.556704      task-clock (msec)         #    0.999 CPUs utilized            ( +-  0.11% )
            21,710      context-switches          #    0.003 M/sec                    ( +-  0.12% )
                 1      cpu-migrations            #    0.000 K/sec                    ( +- 11.11% )
             7,929      page-faults               #    0.001 M/sec                    ( +-  1.75% )
    30,280,536,799      cycles                    #    4.216 GHz                      ( +-  0.11% )
   <not supported>      stalled-cycles-frontend
   <not supported>      stalled-cycles-backend
    54,481,515,301      instructions              #    1.80  insns per cycle          ( +-  0.09% )
     9,655,822,880      branches                  # 1344.343 M/sec                    ( +-  0.10% )
       170,594,899      branch-misses             #    1.77% of all branches          ( +-  0.10% )

       7.190274755 seconds time elapsed                                          ( +-  0.11% )


After:
       7086.254881      task-clock (msec)         #    0.999 CPUs utilized            ( +-  0.13% )
            21,598      context-switches          #    0.003 M/sec                    ( +-  0.07% )
                 1      cpu-migrations            #    0.000 K/sec
             8,099      page-faults               #    0.001 M/sec                    ( +-  0.97% )
    29,856,727,544      cycles                    #    4.213 GHz                      ( +-  0.12% )
   <not supported>      stalled-cycles-frontend
   <not supported>      stalled-cycles-backend
    53,585,205,542      instructions              #    1.79  insns per cycle          ( +-  0.10% )
     9,638,601,205      branches                  # 1360.183 M/sec                    ( +-  0.10% )
       169,785,181      branch-misses             #    1.76% of all branches          ( +-  0.08% )

       7.094560954 seconds time elapsed

That is, a 1.33% perf improvement in elapsed time.
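
The figure follows from the two "seconds time elapsed" values. As a small
sketch of the arithmetic (the helper name pct_reduction is made up for this
illustration and is not part of any tool used above):

```c
#include <assert.h>

/* Relative reduction, in percent, when going from `before` to `after`.
 * Hypothetical helper, written only to spell out the arithmetic. */
static double pct_reduction(double before, double after)
{
    assert(before > 0.0);
    return 100.0 * (before - after) / before;
}
```

Called as pct_reduction(7.190274755, 7.094560954), this gives roughly 1.33.
The same calculation on the cycle and instruction counts gives roughly
1.40% and 1.65% respectively.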

                Emilio
