
Re: [Qemu-devel] [PATCH 07/10] virtio: combine the read of a descriptor


From: Gonglei (Arei)
Subject: Re: [Qemu-devel] [PATCH 07/10] virtio: combine the read of a descriptor
Date: Fri, 5 Feb 2016 06:16:41 +0000




Regards,
-Gonglei


> -----Original Message-----
> From: Paolo Bonzini [mailto:address@hidden]
> Sent: Thursday, February 04, 2016 6:18 PM
> To: Gonglei (Arei); address@hidden
> Cc: address@hidden; address@hidden
> Subject: Re: [PATCH 07/10] virtio: combine the read of a descriptor
> 
> 
> 
> On 04/02/2016 08:48, Gonglei (Arei) wrote:
> > 11.44%  qemu-kvm                 [.] memory_region_find
> >   6.31%  qemu-kvm                 [.] qemu_get_ram_ptr
> >   4.61%  libpthread-2.19.so       [.] __pthread_mutex_unlock_usercnt
> >   3.54%  qemu-kvm                 [.] qemu_ram_addr_from_host
> >   2.80%  libpthread-2.19.so       [.] pthread_mutex_lock
> >   2.55%  qemu-kvm                 [.] object_unref
> >   2.49%  libc-2.19.so             [.] malloc
> >   2.47%  libc-2.19.so             [.] _int_malloc
> >   2.34%  libc-2.19.so             [.] _int_free
> >   2.18%  qemu-kvm                 [.] object_ref
> >   2.18%  qemu-kvm                 [.] address_space_translate
> >   2.03%  libc-2.19.so             [.] __memcpy_sse2_unaligned
> >   1.76%  libc-2.19.so             [.] malloc_consolidate
> >   1.56%  qemu-kvm                 [.] addrrange_intersection
> >   1.52%  qemu-kvm                 [.] vring_pop
> >   1.36%  qemu-kvm                 [.] find_next_zero_bit
> >   1.30%  [kernel]                 [k] native_write_msr_safe
> >   1.29%  qemu-kvm                 [.] addrrange_intersects
> >   1.21%  qemu-kvm                 [.] vring_map
> >   0.93%  qemu-kvm                 [.] virtio_notify
> >
> > Do you have any thoughts on how to decrease the CPU overhead and get
> > higher throughput? Thanks!
> 
> Using bigger chunks than 256 bytes will reduce the overhead in
> memory_region_find and qemu_get_ram_ptr.  You could expect
> a further 10-12% improvement.
> 
Yes, you're right. Here are the test results:

        Encrypting in chunks of 256 bytes: done. 386.89 MiB in 5.02 secs: 77.13 MiB/sec (1584701 packets)
        Encrypting in chunks of 512 bytes: done. 756.80 MiB in 5.02 secs: 150.86 MiB/sec (1549918 packets)
        Encrypting in chunks of 1024 bytes: done. 1.30 GiB in 5.02 secs: 0.26 GiB/sec (1358614 packets)
        Encrypting in chunks of 2048 bytes: done. 2.42 GiB in 5.02 secs: 0.48 GiB/sec (1270223 packets)
        Encrypting in chunks of 4096 bytes: done. 3.99 GiB in 5.02 secs: 0.79 GiB/sec (1046680 packets)
        Encrypting in chunks of 8192 bytes: done. 6.12 GiB in 5.02 secs: 1.22 GiB/sec (802379 packets)
        Encrypting in chunks of 16384 bytes: done. 8.48 GiB in 5.04 secs: 1.68 GiB/sec (556046 packets)
        Encrypting in chunks of 32768 bytes: done. 10.42 GiB in 5.07 secs: 2.06 GiB/sec (341524 packets)

But 256 bytes is the most common packet size in CT scenarios.
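
To make the trade-off in the numbers above a bit more concrete, here is a rough Python sketch that turns each result line into a packet rate and a per-packet cost. The figures are copied from the benchmark output; the helper name per_packet_stats and the table layout are only illustrative and not part of any tool. The packet rate barely changes between 256 and 512 bytes (roughly 316k vs 309k packets/sec), which is consistent with the fixed per-request overhead dominating at small chunk sizes.

```python
# (chunk size in bytes, packets processed, elapsed seconds), copied from the
# benchmark output quoted in this mail.
RESULTS = [
    (256, 1584701, 5.02),
    (512, 1549918, 5.02),
    (1024, 1358614, 5.02),
    (2048, 1270223, 5.02),
    (4096, 1046680, 5.02),
    (8192, 802379, 5.02),
    (16384, 556046, 5.04),
    (32768, 341524, 5.07),
]


def per_packet_stats(chunk_bytes, packets, secs):
    """Return (packets per second, microseconds per packet, MiB transferred).

    Hypothetical helper for illustration only.
    """
    pkts_per_sec = packets / secs
    usec_per_pkt = secs * 1e6 / packets
    mib_total = chunk_bytes * packets / (1024 * 1024)
    return pkts_per_sec, usec_per_pkt, mib_total


if __name__ == "__main__":
    print(f"{'chunk':>6} {'pkts/s':>10} {'us/pkt':>8} {'MiB':>10}")
    for chunk, packets, secs in RESULTS:
        pps, usec, mib = per_packet_stats(chunk, packets, secs)
        print(f"{chunk:>6} {pps:>10.0f} {usec:>8.2f} {mib:>10.2f}")
```

Since the elapsed times are rounded to two decimals in the output, the derived rates are approximate.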

Regards,
-Gonglei


