
From: Anthony Liguori
Subject: Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
Date: Fri, 09 Aug 2013 09:10:26 -0500
User-agent: Notmuch/0.15.2+202~g0c4b8aa (http://notmuchmail.org) Emacs/23.3.1 (x86_64-pc-linux-gnu)

Rusty Russell <address@hidden> writes:

> Anthony Liguori <address@hidden> writes:
>> I suspect this is a premature optimization.  With a weak function called
>> directly in the accessors below, I suspect you would see no measurable
>> performance overhead compared to this approach.
>>
>> It's all very predictable so the CPU should do a decent job optimizing
>> the if () away.
>
> Perhaps.  I was leery of introducing performance regressions, but the
> actual I/O tends to dominate anyway.
>
> So I tested this, by adding the patch (below) and benchmarking
> qemu-system-i386 on my laptop before and after.
>
> Setup: Intel(R) Core(TM) i5 CPU       M 560  @ 2.67GHz
> (Performance cpu governor enabled)
> Guest: virtio user net, virtio block on raw file, 1 CPU, 512MB RAM.
> (Qemu run under eatmydata to eliminate syncs)

FYI, cache=unsafe is equivalent to using eatmydata.

> First test: ping -f -c 10000 -q 10.0.2.0 (100 times)
> (Ping chosen since packets stay in qemu's user net code)
>
> BEFORE:
>         MIN: 824ms
>         MAX: 914ms
>         AVG: 876.95ms
>         STDDEV: 16ms
>
> AFTER:
>         MIN: 872ms
>         MAX: 933ms
>         AVG: 904.35ms
>         STDDEV: 15ms

I can reproduce this although I also see a larger standard deviation.

BEFORE:
        MIN: 496
        MAX: 1055
        AVG: 873.22
        STDEV: 136.88

AFTER:
        MIN: 494
        MAX: 1456
        AVG: 947.77
        STDEV: 150.89

In my datasets, the stdev is higher in the AFTER case, implying that
there is more variation.  Indeed, the MIN is pretty much the same.

GCC is inlining the functions; I'm still surprised that it's measurable
at all.

At any rate, I think the advantage of not increasing the amount of
target specific code outweighs the performance difference here.  As you
said, if there is real I/O, the difference isn't noticeable.
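
For illustration, the weak-function approach discussed above could look
roughly like the sketch below.  Every name in it
(virtio_needs_swap_sketch, swap16_sketch, virtio_tswap16_sketch) is
hypothetical and not taken from the patch, and the weak attribute is a
GCC/Clang extension.  The idea is that generic code carries a default
that never swaps, and only a target that needs swapping overrides it.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical hook: generic code supplies this weak default, and a
 * target that needs byte-swapped virtio accesses provides a strong
 * definition in its own source file, which the linker then prefers.
 */
__attribute__((weak)) bool virtio_needs_swap_sketch(void)
{
    return false; /* default: no swapping required */
}

/* Swap the two bytes of a 16-bit value. */
static inline uint16_t swap16_sketch(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/*
 * Hypothetical accessor: takes a 16-bit vring or config field and
 * swaps it only when the target asks for it.  For a given target the
 * branch goes the same way on every call, which is why it is expected
 * to be cheap.
 */
static inline uint16_t virtio_tswap16_sketch(uint16_t v)
{
    return virtio_needs_swap_sketch() ? swap16_sketch(v) : v;
}
```

With only the weak default linked in, virtio_tswap16_sketch() returns
its argument unchanged, so targets that never swap need no code of
their own.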

Regards,

Anthony Liguori


