From: Paolo Bonzini
Subject: Re: [Qemu-block] [RFC PATCH 00/40] Sneak peek of virtio and dataplane changes for 2.6
Date: Thu, 26 Nov 2015 11:39:20 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0

On 26/11/2015 10:36, Christian Borntraeger wrote:
> For some unknown reason, this seems to be slightly slower than 2.5-rc1 on my
> old z196. (have not yet tested the z13)
>
> your branch is certainly better regarding malloc, but worse regarding others.
Thanks for taking the time to test this!
This is correct, see the cover letter:
"[Patches 14 to 16 remove] the duplicate dataplane-specific
implementation of virtio in favor of the regular one that is already
used for non-dataplane. While the dataplane implementation is slightly
more optimized, I chose to keep the other one to avoid another "touch
all virtio devices" series.
Patch 10 alone mostly brings performance on par between the two.
The remaining 7-8% can be recovered by mostly getting rid of tiny
address_space_* operations, keeping the rings always mapped. Note that
the rest of this big series does bring a little performance improvement,
and already makes up for the lost performance."
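The profile shows that the culprit is the repeated access to the virtio
ring. The first listing is from this branch (virtio.c, virtqueue_pop),
the second from 2.5-rc1 (vring.c, vring_pop):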
3.99% qemu-system-s39 libc-2.18.so [.] __memcpy_z196
2.66% qemu-system-s39 qemu-system-s390x [.] address_space_lduw_le
2.51% qemu-system-s39 qemu-system-s390x [.] address_space_map
2.51% qemu-system-s39 qemu-system-s390x [.] phys_page_find
2.24% qemu-system-s39 qemu-system-s390x [.] qemu_get_ram_ptr
2.18% qemu-system-s39 qemu-system-s390x [.] address_space_translate_internal
1.91% qemu-system-s39 qemu-system-s390x [.] qemu_coroutine_switch
1.66% qemu-system-s39 qemu-system-s390x [.] address_space_rw
1.63% qemu-system-s39 qemu-system-s390x [.] address_space_stw_le
1.57% qemu-system-s39 qemu-system-s390x [.] address_space_stl_le
1.57% qemu-system-s39 qemu-system-s390x [.] address_space_translate
1.45% qemu-system-s39 qemu-system-s390x [.] virtqueue_pop
0.91% qemu-system-s39 qemu-system-s390x [.] qemu_ram_block_from_host
0.79% qemu-system-s39 qemu-system-s390x [.] vring_desc_read
0.76% qemu-system-s39 qemu-system-s390x [.] qemu_get_ram_block
-----------
28.33%
3.30% qemu-system-s39 libc-2.18.so [.] __memcpy_z196
2.83% qemu-system-s39 qemu-system-s390x [.] memory_region_find_rcu
2.72% qemu-system-s39 qemu-system-s390x [.] vring_pop
1.37% qemu-system-s39 qemu-system-s390x [.] address_space_rw
1.37% qemu-system-s39 qemu-system-s390x [.] qemu_get_ram_ptr
1.18% qemu-system-s39 qemu-system-s390x [.] memory_region_find
0.92% qemu-system-s39 qemu-system-s390x [.] get_desc.isra.11
0.92% qemu-system-s39 qemu-system-s390x [.] qemu_ram_block_from_host
0.84% qemu-system-s39 qemu-system-s390x [.] vring_push
-----------
15.45%
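
To illustrate why the address_space_* entries add up, here is a minimal
sketch of the two access patterns. It is a toy model, not QEMU code: every
name in it (ToyRegion, ToyAddressSpace, toy_translate, toy_lduw_translated,
ToyMappedRing, toy_ring_map, toy_lduw_mapped) is hypothetical, and the
"address space" is just a flat array of regions. The first read helper
translates the guest address on every 16-bit load, roughly the pattern the
first listing suggests; the second translates the ring once and then does
plain loads, which is the idea behind keeping the rings always mapped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy model: one contiguous chunk of guest memory. */
typedef struct {
    uint64_t guest_base;    /* first guest-physical address covered */
    uint64_t size;          /* length in bytes */
    uint8_t *host;          /* host storage backing the region */
} ToyRegion;

typedef struct {
    const ToyRegion *regions;
    size_t nregions;
    unsigned long lookups;  /* number of translations, for illustration */
} ToyAddressSpace;

/* Translate a guest range to a host pointer; NULL if it is not covered. */
static uint8_t *toy_translate(ToyAddressSpace *as, uint64_t addr, uint64_t len)
{
    as->lookups++;
    for (size_t i = 0; i < as->nregions; i++) {
        const ToyRegion *r = &as->regions[i];
        if (addr >= r->guest_base && len <= r->size &&
            addr - r->guest_base <= r->size - len) {
            return r->host + (addr - r->guest_base);
        }
    }
    return NULL;
}

/* Pattern 1: every little-endian 16-bit load pays for a translation. */
static int toy_lduw_translated(ToyAddressSpace *as, uint64_t addr,
                               uint16_t *out)
{
    uint8_t *p = toy_translate(as, addr, sizeof(*out));
    if (!p) {
        return -1;
    }
    *out = (uint16_t)(p[0] | (p[1] << 8));
    return 0;
}

/* Pattern 2: the ring is translated once and stays mapped. */
typedef struct {
    uint8_t *host;
    uint64_t len;
} ToyMappedRing;

static int toy_ring_map(ToyAddressSpace *as, uint64_t addr, uint64_t len,
                        ToyMappedRing *ring)
{
    ring->host = toy_translate(as, addr, len);
    ring->len = ring->host ? len : 0;
    return ring->host ? 0 : -1;
}

/* Bounds-checked little-endian load from the already mapped ring. */
static int toy_lduw_mapped(const ToyMappedRing *ring, uint64_t offset,
                           uint16_t *out)
{
    if (offset > ring->len || ring->len - offset < sizeof(*out)) {
        return -1;
    }
    *out = (uint16_t)(ring->host[offset] | (ring->host[offset + 1] << 8));
    return 0;
}
```

With one region and N reads of the same field, the first helper performs N
translations and the second performs one. The sketch ignores what a real
implementation cannot: a cached mapping has to be dropped or refreshed
whenever the guest memory layout changes.
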
I would really prefer to get rid of vring.c as soon as the infrastructure
makes it possible, even though vring.c is currently faster. We know what
makes virtio.c slower, and it's simpler to fix virtio.c than to convert all
the other models to vring.c _plus_ make vring.c safe for migration.
Paolo