From: Li, Liang Z
Subject: Re: [Qemu-devel] [PATCH v1 0/6] A migration performance testing framework
Date: Fri, 6 May 2016 01:33:14 +0000
> This series of patches provides a framework for testing migration
> performance characteristics. The motivating factor for this is planning
> that is underway in OpenStack wrt making use of QEMU migration features
> such as compression, auto-converge and post-copy. The primary aim for
> OpenStack is to have Nova autonomously manage migration features &
> tunables to maximise chances that migration will complete. The problem
> faced is figuring out just which QEMU migration features are "best"
> suited to our needs. This means we want data on how well they are able
> to ensure completion of a migration, against the host resources used
> and the impact on the guest workload performance.
>
> The test framework produced here takes a pathological guest workload
> (every CPU just burning 100% of its time XOR'ing every byte of guest
> memory with random data). This is quite a pessimistic test because most
> guest workloads are not going to be this heavy on memory writes, and
> their data won't be uniformly random, so it will compress better than
> this test does.
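The stress workload itself is implemented in C in this series (tests/migration/stress.c); purely as an illustration of the idea, here is a scaled-down Python sketch of one XOR pass over "guest RAM" (a 1 MiB buffer instead of 1 GiB, so it runs in a fraction of a second):

```python
import os
import time

def xor_pass(mem: bytearray, rand: bytes) -> float:
    """One pass of the stress workload: XOR every byte of 'mem' with
    random data, dirtying every page. Returns elapsed seconds."""
    start = time.monotonic()
    for i, b in enumerate(rand):
        mem[i] ^= b
    return time.monotonic() - start

# Scaled-down demo buffer: 1 MiB rather than 1 GiB.
SIZE = 1 << 20
mem = bytearray(SIZE)        # "guest RAM", initially zeroed
rand = os.urandom(SIZE)
secs = xor_pass(mem, rand)

# Extrapolate to the ms/GB figure used on the charts' left axis
# (very rough: a real guest runs this compiled, not interpreted).
ms_per_gb = secs * 1000.0 * ((1 << 30) / SIZE)
```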
>
Wonderful test report!
> With this worst case guest, I have produced a set of tests using UNIX socket,
> TCP localhost, TCP remote and RDMA remote socket transports, with both a
> 1 GB RAM + 1 CPU guest and an 8 GB RAM + 4 CPU guest.
>
> The TCP/RDMA remote host tests were run over a 10-Gig-E network interface.
>
> I have put the results online to view here:
>
> https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/
>
> The charts here are showing two core sets of data:
>
> - The guest CPU performance. The left axis shows the time in
> milliseconds required to XOR 1 GB of memory. This is shown per guest
> CPU and combined across all CPUs.
>
> - The host CPU utilization. The right axis shows the overall QEMU
> process CPU utilization, and the per-VCPU utilization.
>
> Note that the charts are interactive - you can turn on/off each plot line and
> zoom in by selecting regions on the chart.
>
>
> Some interesting things that I have observed with this
>
> - At the start of each iteration of migration there is a distinct drop in
> guest CPU performance as shown by a spike in the guest CPU time lines.
> Performance would drop from 200ms/GB to 400ms/GB. Presumably this is
> related to QEMU recalculating the dirty bitmap for the guest RAM. See
> the spikes in the green line in:
>
> https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-1gb-1cpu/post-copy-bandwidth/post-copy-bw-1gbs.html
>
> - For the larger sized guests, the auto-converge code has to throttle the
> guest to as much as 90% or more before it is able to meet the 500ms max
> downtime value
>
> https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-1gb-1cpu/auto-converge-bandwidth/auto-converge-bw-1gbs.html
>
> Even then I often saw tests aborting as they hit the max number of
> iterations I permitted (30 iters max)
>
> https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-8gb-4cpu/auto-converge-bandwidth/auto-converge-bw-10gbs.html
>
> - MT compression is actively harmful to the chances of successful
> migration when the guest RAM is not compression friendly. My workload
> is the worst case since it is splattering RAM with totally random
> bytes. MT compression dramatically increases the time for each
> iteration, as we bottleneck on CPU compression speed, leaving the
> network largely idle. This causes a migration which would have
> completed without compression to fail. It also burns huge amounts of
> host CPU time.
>
> https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-1gb-1cpu/compr-mt/compr-mt-threads-4.html
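That bottleneck is easy to reproduce in miniature: MT compression is zlib-based, and zlib gains essentially nothing on uniformly random pages while still paying the full CPU cost. A quick stdlib demonstration (1 MiB samples, fastest compression level):

```python
import os
import zlib

SIZE = 1 << 20  # 1 MiB sample of "guest RAM"

random_ram = os.urandom(SIZE)   # the worst case: uniformly random bytes
zero_ram = bytes(SIZE)          # the friendliest case: untouched zero pages

# Compressed-size / original-size ratio at zlib level 1 (fastest).
random_ratio = len(zlib.compress(random_ram, 1)) / SIZE
zero_ratio = len(zlib.compress(zero_ram, 1)) / SIZE

# Random data does not shrink at all (ratio ~1.0, pure CPU cost),
# while zero pages collapse to almost nothing.
```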
>
> - XBZRLE compression did not have as much of a CPU performance penalty
> on the host as MT compression, but also did not help migration to
> actually complete. Again this is largely due to the workload being the
> worst case scenario with random bytes. The downside is obviously the
> potentially significant memory overhead on the host due to the cache
> sizing.
>
> https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-1gb-1cpu/compr-xbzrle/compr-xbzrle-cache-50.html
>
>
> - Post-copy, by its very nature, obviously ensured that the migration
> would complete. While post-copy was running in pre-copy mode there was
> a somewhat chaotic small impact on guest CPU performance, causing
> performance to periodically oscillate between 400ms/GB and 800ms/GB.
> This is less than the impact at the start of each migration iteration,
> which was 1000ms/GB in this test. There was also a massive penalty at
> the time of switchover from pre to post copy, as is to be expected. The
> migration completed in the post-copy phase quite quickly though. For
> this workload, the number of iterations in pre-copy mode before
> switching to post-copy did not have much impact. I expect a less
> extreme workload would have shown more interesting results wrt the
> number of iterations of pre-copy:
>
> https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-8gb-4cpu/post-copy-iters.html
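For reference, driving a post-copy migration from a management layer boils down to three QMP commands. A sketch that just builds the command JSON (note the capability was named "x-postcopy-ram" before QEMU 2.6, and the destination URI here is a placeholder):

```python
import json

def qmp_cmd(name, **arguments):
    """Build a QMP command as a JSON-serialisable dict."""
    cmd = {"execute": name}
    if arguments:
        cmd["arguments"] = arguments
    return cmd

# 1. Enable the post-copy capability before migration starts.
enable = qmp_cmd("migrate-set-capabilities",
                 capabilities=[{"capability": "postcopy-ram",
                                "state": True}])
# 2. Kick off migration (placeholder destination URI).
start = qmp_cmd("migrate", uri="tcp:otherhost:4444")
# 3. After the desired number of pre-copy iterations, flip to post-copy.
switch = qmp_cmd("migrate-start-postcopy")

wire = "\n".join(json.dumps(c) for c in (enable, start, switch))
```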
>
>
> Overall, if we're looking for a solution that can guarantee completion
> under the most extreme guest workload, then only post-copy &
> auto-converge appear up to the job.
>
> The MT compression is seriously harmful to migration and has severe CPU
> overhead. The XBZRLE compression is moderately harmful to migration and
> has a potentially severe memory overhead, since large cache sizes are
> needed to make it useful.
>
> While auto-converge can ensure that guest migration completes, it has a
> pretty significant long-term impact on guest CPU performance to achieve
> this, i.e. the guest spends a long time in pre-copy mode with its CPUs
> very dramatically throttled down. The level of throttling required
> makes one wonder whether it is worth using at all, compared with simply
> pausing the guest workload. The latter has a hard blackout period, but
> over quite a short time frame if the network is fast.
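That "simply pause the guest" alternative is easy to quantify: in the idealised case the blackout is just RAM size over link speed, ignoring protocol overhead (and with the guest paused, nothing is re-dirtied). A back-of-envelope sketch:

```python
def pause_blackout_secs(ram_bytes: float, link_bits_per_sec: float) -> float:
    """Idealised blackout for migrating a paused guest: all of RAM must
    cross the wire exactly once, at full link speed. Ignores protocol
    overhead, so this is a lower bound."""
    return ram_bytes * 8 / link_bits_per_sec

GiB = 1 << 30
# The 8 GB guest from the tests, over a 10 Gb/s link: roughly 7 seconds.
blackout = pause_blackout_secs(8 * GiB, 10e9)
```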
>
> The post-copy code does have an impact on guest performance while in
> pre-copy mode, vs a plain migration. It also has a fairly high spike
> when in post-copy mode, but this lasts for a pretty short time. As
> compared to auto-converge, it is able to ensure the migration completes
> in a finite time without having a prolonged impact on guest CPU
> performance. The penalty during the post-copy phase is on a par with
> the penalty imposed by auto-converge when it has to throttle to 90%+.
>
>
> Overall, in the context of a worst case guest workload, it appears that
> post-copy is the clear winning strategy, ensuring completion of
> migration without imposing a long-duration penalty on guest
> performance. If the risk of failure from post-copy is unacceptable then
> auto-converge is a good fallback option, if the long-duration guest CPU
> penalty can be accepted.
>
> The compression options are only worth using if the host has free CPU
> resources, and the guest RAM is believed to be compression friendly, as they
> steal significant CPU time away from guests in order to run compression,
> often with a negative impact on migration completion chances.
>
MT compression should only be used when network bandwidth is the
bottleneck that affects live migration. Using another, faster
(de)compression algorithm could reduce the CPU overhead.
Liang
> Looking at migration in general, even with a 10-Gig-E NIC and RDMA
> transport it is possible for a single guest to provide a workload that
> will saturate the network during migration & thus prevent completion.
> Based on this, there is little point in attempting to run migrations in
> parallel on the same host, unless multiple NICs are available, as
> parallel migrations would reduce the chances of either one ever
> completing. Better reliability & faster overall completion would likely
> be achieved by fully serializing migration operations per host.
>
> There is clearly scope for more investigation here, in particular
>
> - Produce some alternative guest workloads that try to present
> a more "average" scenario workload, instead of the worst-case.
> These would likely allow compression to have some positive
> impact.
>
> - Try various combinations of strategies. For example, combining
> post-copy and auto-converge at the same time, or compression
> combined with either post-copy or auto-converge.
>
> - Investigate block migration performance too, with NBD migration
> server.
>
> - Investigate effect of dynamically changing max downtime value
> during migration, rather than using a fixed 500ms value.
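Such a dynamic downtime policy would be easy to script against QMP. A sketch that builds a geometric ramp of downtime values using the legacy "migrate_set_downtime" command (value in seconds; newer QEMU instead takes a "downtime-limit" in milliseconds via "migrate-set-parameters"):

```python
import json

def set_downtime_cmd(seconds: float) -> dict:
    """Build the legacy QMP 'migrate_set_downtime' command.
    The ramp schedule below is a hypothetical policy, not something
    the test framework currently implements."""
    return {"execute": "migrate_set_downtime",
            "arguments": {"value": seconds}}

# E.g. start at the fixed 500ms and grow the allowed downtime by 50%
# each pre-copy iteration that fails to converge.
schedule = [set_downtime_cmd(0.5 * (1.5 ** i)) for i in range(4)]
wire = [json.dumps(c) for c in schedule]
```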
>
>
> Daniel P. Berrange (6):
> scripts: add __init__.py file to scripts/qmp/
> scripts: add a 'debug' parameter to QEMUMonitorProtocol
> scripts: refactor the VM class in iotests for reuse
> scripts: set timeout when waiting for qemu monitor connection
> scripts: ensure monitor socket has SO_REUSEADDR set
> tests: introduce a framework for testing migration performance
>
> configure | 2 +
> scripts/qemu.py | 202 +++++++++++
> scripts/qmp/__init__.py | 0
> scripts/qmp/qmp.py | 15 +-
> scripts/qtest.py | 34 ++
> tests/Makefile | 12 +
> tests/migration/.gitignore | 2 +
> tests/migration/guestperf-batch.py | 26 ++
> tests/migration/guestperf-plot.py | 26 ++
> tests/migration/guestperf.py | 27 ++
> tests/migration/guestperf/__init__.py | 0
> tests/migration/guestperf/comparison.py | 124 +++++++
> tests/migration/guestperf/engine.py | 439 ++++++++++++++++++++++
> tests/migration/guestperf/hardware.py | 62 ++++
>  tests/migration/guestperf/plot.py       | 623 ++++++++++++++++++++++++++++++++
> tests/migration/guestperf/progress.py | 117 ++++++
> tests/migration/guestperf/report.py | 98 +++++
> tests/migration/guestperf/scenario.py | 95 +++++
> tests/migration/guestperf/shell.py | 255 +++++++++++++
> tests/migration/guestperf/timings.py | 55 +++
> tests/migration/stress.c | 367 +++++++++++++++++++
> tests/qemu-iotests/iotests.py | 135 +------
>  22 files changed, 2583 insertions(+), 133 deletions(-)
>  create mode 100644 scripts/qemu.py
>  create mode 100644 scripts/qmp/__init__.py
>  create mode 100644 tests/migration/.gitignore
>  create mode 100755 tests/migration/guestperf-batch.py
>  create mode 100755 tests/migration/guestperf-plot.py
>  create mode 100755 tests/migration/guestperf.py
>  create mode 100644 tests/migration/guestperf/__init__.py
>  create mode 100644 tests/migration/guestperf/comparison.py
>  create mode 100644 tests/migration/guestperf/engine.py
>  create mode 100644 tests/migration/guestperf/hardware.py
>  create mode 100644 tests/migration/guestperf/plot.py
>  create mode 100644 tests/migration/guestperf/progress.py
>  create mode 100644 tests/migration/guestperf/report.py
>  create mode 100644 tests/migration/guestperf/scenario.py
>  create mode 100644 tests/migration/guestperf/shell.py
>  create mode 100644 tests/migration/guestperf/timings.py
>  create mode 100644 tests/migration/stress.c
>
> --
> 2.5.5
>