From: Avi Kivity
Subject: Re: [Qemu-devel] [PATCH RFC] virtio: put last seen used index into ring itself
Date: Sun, 23 May 2010 19:03:10 +0300
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.9) Gecko/20100330 Fedora/3.0.4-1.fc12 Thunderbird/3.0.4

On 05/23/2010 06:51 PM, Michael S. Tsirkin wrote:
>>> So locked version seems to be faster than unlocked, and share/unshare not to matter?
>>
>> May be due to the processor using the LOCK operation as a hint to reserve the cacheline for a bit.
>
> Maybe we should use atomics on index then?

This should only be helpful if you access the cacheline several times in a row. That's not the case in virtio (or here).
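
To make the two options under discussion concrete, here is a minimal C11 sketch of bumping a shared ring index either with a plain store or with an atomic read-modify-write (which compilers typically emit as a LOCK-prefixed instruction on x86). The struct and function names are hypothetical; they are not taken from the virtio code or from the cachebounce test.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a ring index living in shared memory. */
struct ring_index {
    _Atomic uint16_t idx;
};

/* Plain publish: relaxed load followed by a release store.
 * No locked read-modify-write; only safe with a single writer. */
static inline uint16_t index_bump_plain(struct ring_index *r)
{
    uint16_t next =
        (uint16_t)(atomic_load_explicit(&r->idx, memory_order_relaxed) + 1);
    atomic_store_explicit(&r->idx, next, memory_order_release);
    return next;
}

/* Locked publish: a single atomic read-modify-write.
 * Returns the new index value, wrapping at 16 bits. */
static inline uint16_t index_bump_locked(struct ring_index *r)
{
    return (uint16_t)(atomic_fetch_add(&r->idx, 1) + 1);
}
```

Each call touches the cacheline once, so this sketch does not by itself show any benefit from the cacheline being held across several accesses in a row.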
I think the problem is that LOCKSHARE and SHARE are not symmetric, so they can't be directly compared.
> OK, after adding mb in code (patch will be sent separately), the test works for my workstation. locked is still fastest, unshared sometimes shows wins and sometimes loses over shared.
>
> address@hidden ~]# ./cachebounce share 0 1
> CPU 0: share cacheline: 6638521 usec
> CPU 1: share cacheline: 6638478 usec

66 ns? nice.
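
The 66 ns figure comes from dividing the reported total time by the number of cacheline round trips. The iteration count is not stated in this message; the sketch below assumes 100,000,000 iterations, which is the value that makes 6638521 usec come out near 66 ns. Both the helper name and that count are assumptions.

```c
#include <stdint.h>

/* Assumed loop count of the test; not stated in the thread. */
#define ASSUMED_ITERATIONS 100000000ULL

/* Convert a total run time in microseconds into whole nanoseconds
 * per iteration (integer division, so the result is truncated). */
static inline uint64_t ns_per_iteration(uint64_t total_usec,
                                        uint64_t iterations)
{
    return (total_usec * 1000ULL) / iterations;
}
```

Under that assumption, `ns_per_iteration(6638521, ASSUMED_ITERATIONS)` evaluates to 66.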

> address@hidden ~]# ./cachebounce share 0 2
> CPU 0: share cacheline: 14529198 usec
> CPU 2: share cacheline: 14529156 usec

140 ns, not too bad. I hope I'm not misinterpreting the results.

-- 
error compiling committee.c: too many arguments to function