Re: [Qemu-devel] [PATCH v2] rtc: placing RTC memory region outside BQL


From: Gonglei (Arei)
Subject: Re: [Qemu-devel] [PATCH v2] rtc: placing RTC memory region outside BQL
Date: Fri, 9 Feb 2018 10:05:03 +0000

>
> > >
> > > $ cat strace_c.sh
> > > strace -tt -p $1 -c -o result_$1.log &
> > > sleep $2
> > > pid=$(pidof strace)
> > > kill $pid
> > > cat result_$1.log
> > >
> > > Before applying this change:
> > > $ ./strace_c.sh 10528 30
> > > % time     seconds  usecs/call     calls    errors syscall
> > > ------ ----------- ----------- --------- --------- ----------------
> > >  93.87    0.119070          30      4000           ppoll
> > >   3.27    0.004148           2      2038           ioctl
> > >   2.66    0.003370           2      2014           futex
> > >   0.09    0.000113           1       106           read
> > >   0.09    0.000109           1       104           io_getevents
> > >   0.02    0.000029           1        30           poll
> > >   0.00    0.000000           0         1           write
> > > ------ ----------- ----------- --------- --------- ----------------
> > > 100.00    0.126839                  8293           total
> > >
> > > After applying the change:
> > > $ ./strace_c.sh 23829 30
> > > % time     seconds  usecs/call     calls    errors syscall
> > > ------ ----------- ----------- --------- --------- ----------------
> > >  92.86    0.067441          16      4094           ppoll
> > >   4.85    0.003522           2      2136           ioctl
> > >   1.17    0.000850           4       189           futex
> > >   0.54    0.000395           2       202           read
> > >   0.52    0.000379           2       202           io_getevents
> > >   0.05    0.000037           1        30           poll
> > > ------ ----------- ----------- --------- --------- ----------------
> > > 100.00    0.072624                  6853           total
> > >
> > > The number of futex calls decreases by ~90.6% on an idle Windows 7 guest.
> >
> > These are the same figures as from v1 -- it would be interesting
> > to check whether the additional locking that v2 adds has affected
> > the results.
> >
> Oh, yes. With v2 the futex count does not drop nearly as much as with v1,
> because v2 now takes the BQL before raising the outbound IRQ line.
> 
> Before applying v2:
> # ./strace_c.sh 8776 30
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  78.01    0.164188          26      6436           ppoll
>   8.39    0.017650           5      3700        39 futex
>   7.68    0.016157           6      2758           ioctl
>   5.48    0.011530           3      4586      1113 read
>   0.30    0.000640          20        32           io_submit
>   0.15    0.000317           4        89           write
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.210482                 17601      1152 total
> 
> After applying v2:
> # ./strace_c.sh 15968 30
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  78.28    0.171117          27      6272           ppoll
>   8.50    0.018571           5      3663        21 futex
>   7.76    0.016973           6      2732           ioctl
>   4.85    0.010597           3      4115       853 read
>   0.31    0.000672          11        63           io_submit
>   0.30    0.000659           4       180           write
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.218589                 17025       874 total
> 
> > Does the patch improve performance in a more interesting use
> > case than "the guest is just idle" ?
> >
> I think so; after all, the scope of the locking is reduced.
> Besides this, can we optimize the RTC timer to avoid holding the BQL
> by using separate threads?
> 
Hi Peter, Paolo

I ran PCMark 8 (https://www.futuremark.com/benchmarks/pcmark)
in a Windows 7 guest and got the results below:

Guest: 2U2G

Before applying v2:

Your Work 2.0 score:                      2000
Web Browsing - JunglePin                  0.334s
Web Browsing - Amazonia                   0.132s
Writing                                   3.59s
Spreadsheet                               70.13s
Video Chat v2/Video Chat playback 1 v2    22.8 fps
Video Chat v2/Video Chat encoding v2      307.0 ms
Benchmark duration                        1h 35min 46s

After applying v2:

Your Work 2.0 score:                      2040
Web Browsing - JunglePin                  0.345s
Web Browsing - Amazonia                   0.132s
Writing                                   3.56s
Spreadsheet                               67.83s
Video Chat v2/Video Chat playback 1 v2    28.7 fps
Video Chat v2/Video Chat encoding v2      324.7 ms
Benchmark duration                        1h 32min 5s

The test results show that the optimization also helps under load: the
Work 2.0 score rises from 2000 to 2040 and the benchmark finishes
3min 41s sooner.
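
For anyone who wants to recompute the relative changes from the strace
summaries and PCMark 8 scores above, here is a minimal sketch. The helper
name pct_change is hypothetical (it is not part of strace_c.sh or the patch),
and it assumes a POSIX shell with awk available:

```shell
#!/bin/sh
# pct_change OLD NEW: print the relative change from OLD to NEW in percent,
# signed and rounded to one decimal place. awk does the arithmetic because
# POSIX sh has no floating-point support. OLD must be non-zero.
pct_change() {
    awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%+.1f\n", (new - old) * 100 / old }'
}

pct_change 2014 189     # futex calls, first pair of tables (v1 figures)
pct_change 3700 3663    # futex calls, before/after applying v2
pct_change 2000 2040    # PCMark 8 Work 2.0 score, before/after applying v2
```

With the numbers quoted in this mail it prints -90.6, -1.0 and +2.0, which
matches the ~90.6% futex reduction reported for v1 and shows how small the
futex change is with v2.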

Thanks,
-Gonglei

