
Re: [Qemu-devel] [RFC v4 00/28] Base enabling patches for MTTCG


From: Alex Bennée
Subject: Re: [Qemu-devel] [RFC v4 00/28] Base enabling patches for MTTCG
Date: Fri, 12 Aug 2016 16:01:50 +0100
User-agent: mu4e 0.9.17; emacs 25.1.4

G 3 <address@hidden> writes:

> On Aug 12, 2016, at 9:19 AM, Alex Bennée wrote:
>
>> On 11 August 2016 at 17:43, G 3 <address@hidden> wrote:
>>>
>>> On Aug 11, 2016, at 11:24 AM, address@hidden wrote:
>>>
>>>
>>> Performance
>>> ===========
>>>
>>> You can't do full work-load testing on this tree due to the lack of
>>> atomic support (but I will run some numbers on
>>> mttcg/base-patches-v4-with-cmpxchg-atomics-v2). However you certainly
>>> see a run time improvement with the kvm-unit-tests TCG group.
>>>
>>>   retry.py called with ['./run_tests.sh', '-t', '-g', 'tcg', '-o', '-accel tcg,thread=single']
>>>   run 1: ret=0 (PASS), time=1047.147924 (1/1)
>>>   run 2: ret=0 (PASS), time=1071.921204 (2/2)
>>>   run 3: ret=0 (PASS), time=1048.141600 (3/3)
>>>   Results summary:
>>>   0: 3 times (100.00%), avg time 1055.737 (196.70 varience/14.02 deviation)
>>>   Ran command 3 times, 3 passes
>>>
>>>   retry.py called with ['./run_tests.sh', '-t', '-g', 'tcg', '-o', '-accel tcg,thread=multi']
>>>   run 1: ret=0 (PASS), time=303.074210 (1/1)
>>>   run 2: ret=0 (PASS), time=304.574991 (2/2)
>>>   run 3: ret=0 (PASS), time=303.327408 (3/3)
>>>   Results summary:
>>>   0: 3 times (100.00%), avg time 303.659 (0.65 varience/0.80 deviation)
>>>   Ran command 3 times, 3 passes
>>>
>>> The TCG tests run with -smp 4 on my system. While the TCG tests are
>>> purely CPU bound they do exercise the hot and cold paths of TCG
>>> execution (especially when triggering SMC detection). However there is
>>> still a benefit even with a 50% overhead compared to the ideal 263
>>> second elapsed time.
>>>
>>>
>>> Alex
>>>
>>>
>>>
>>> Your test results look very promising. It looks like you saw a 3x
>>> speed improvement over single threading. Excellent. I wonder what the
>>> numbers would be for a 22 core Xeon or 72 core Xeon Phi...
>>
>> Well the initial results look like they tail off but I need to test
>> on a more capable machine. I'm going to package up the test case first
>> so people can easily replicate the test.
>>
>>> Do you think you could run some tests with an x86 guest like Windows
>>> XP? There are plenty of benchmark tests for this platform. Video
>>> encoding, Youtube video playback, and number crunching programs'
>>> results would be very interesting to see.
>>
>> I don't have any Windows images to hand I'm afraid. Besides Windows
>> is a fairly boring guest from this point of view because:
>>
>>   - it's x86, so why use TCG over KVM
>>   - QEMU TCG generally sucks at media benchmarks due to SIMD emulation
>
> Mac OS X hosts don't have a hypervisor that QEMU supports (VirtualBox
> isn't supported), so TCG is the only thing that can be used. Maybe a
> free x86 guest like Linux could be used?

Sounds like you have the kit for this test case. Can you let me know
whether the branch boots your test images?

--
Alex Bennée
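
The speed-up figures discussed in this thread can be recomputed from the
raw run times quoted above. The following is only a minimal sketch: the
function names are made up for illustration, and it assumes that "ideal"
means the single-threaded average divided evenly across the 4 vCPUs
(-smp 4) and that "overhead" means how far the multi-threaded average
sits above that ideal. Under those assumptions the overhead comes out
lower than the figure mentioned in the quoted text, so a different
definition may have been intended there.

```python
from statistics import mean, stdev

# Run times (seconds) copied from the retry.py output quoted in the thread.
single_runs = [1047.147924, 1071.921204, 1048.141600]  # -accel tcg,thread=single
multi_runs = [303.074210, 304.574991, 303.327408]      # -accel tcg,thread=multi
VCPUS = 4                                              # tests ran with -smp 4


def speedup(baseline: float, parallel: float) -> float:
    """How many times faster the parallel run is than the baseline."""
    return baseline / parallel


def ideal_time(baseline: float, vcpus: int) -> float:
    """Assumed ideal: baseline work split evenly across all vCPUs."""
    return baseline / vcpus


def overhead_pct(actual: float, ideal: float) -> float:
    """Assumed definition: percentage by which actual exceeds ideal."""
    return (actual - ideal) / ideal * 100.0


single_avg = mean(single_runs)
multi_avg = mean(multi_runs)
ideal = ideal_time(single_avg, VCPUS)

print(f"single: avg {single_avg:.3f}s, sample deviation {stdev(single_runs):.2f}")
print(f"multi:  avg {multi_avg:.3f}s, sample deviation {stdev(multi_runs):.2f}")
print(f"speed-up:   {speedup(single_avg, multi_avg):.2f}x")
print(f"ideal time: {ideal:.1f}s")
print(f"overhead:   {overhead_pct(multi_avg, ideal):.1f}%")
```

The averages and deviations match the retry.py summaries, which suggests
the script reports the sample (n-1) variance.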


