From: Li, Liang Z
Subject: Re: [Qemu-devel] [PATCH v1 0/6] A migration performance testing framework
Date: Fri, 6 May 2016 01:33:14 +0000
> This series of patches provides a framework for testing migration
> performance characteristics. The motivating factor for this is planning
> that is underway in OpenStack wrt making use of QEMU migration features
> such as compression, auto-converge and post-copy. The primary aim for
> OpenStack is to have Nova autonomously manage migration features &
> tunables to maximise chances that migration will complete. The problem
> faced is figuring out just which QEMU migration features are "best"
> suited to our needs. This means we want data on how well they are able
> to ensure completion of a migration, against the host resources used
> and the impact on the guest workload performance.
>
> The test framework produced here takes a pathological guest workload
> (every CPU just burning 100% of its time XOR'ing every byte of guest
> memory with random data). This is quite a pessimistic test because most
> guest workloads are not going to be this heavy on memory writes, and
> their data won't be uniformly random, so it will compress better than
> this test does.
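The stress workload itself is implemented in C in this series (tests/migration/stress.c); purely as an illustration of the idea, here is a scaled-down Python sketch of one XOR pass over "guest RAM" (a 1 MiB buffer instead of 1 GiB, so it runs in a fraction of a second):

```python
import os
import time

def xor_pass(mem: bytearray, rand: bytes) -> float:
    """One pass of the stress workload: XOR every byte of 'mem' with
    random data, dirtying every page. Returns elapsed seconds."""
    start = time.monotonic()
    for i, b in enumerate(rand):
        mem[i] ^= b
    return time.monotonic() - start

# Scaled-down demo buffer: 1 MiB rather than 1 GiB.
SIZE = 1 << 20
mem = bytearray(SIZE)        # "guest RAM", initially zeroed
rand = os.urandom(SIZE)
secs = xor_pass(mem, rand)

# Extrapolate to the ms/GB figure used on the charts' left axis
# (very rough: a real guest runs this compiled, not interpreted).
ms_per_gb = secs * 1000.0 * ((1 << 30) / SIZE)
```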
>
Wonderful test report!
> With this worst case guest, I have produced a set of tests using UNIX socket,
> TCP localhost, TCP remote and RDMA remote socket transports, with both a
> 1 GB RAM + 1 CPU guest and an 8 GB RAM + 4 CPU guest.
>
> The TCP/RDMA remote host tests were run over a 10-Gig-E network interface.
>
> I have put the results online to view here:
>
> https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/
>
> The charts here are showing two core sets of data:
>
> - The guest CPU performance. The left axis shows the time in
> milliseconds required to XOR 1 GB of memory. This is shown per guest
> CPU and combined across all CPUs.
>
> - The host CPU utilization. The right axis shows the overall QEMU
> process CPU utilization, and the per-VCPU utilization.
>
> Note that the charts are interactive - you can turn on/off each plot line and
> zoom in by selecting regions on the chart.
>
>
> Some interesting things that I have observed with this
>
> - At the start of each iteration of migration there is a distinct drop in
> guest CPU performance as shown by a spike in the guest CPU time lines.
> Performance would drop from 200ms/GB to 400ms/GB. Presumably this is
> related to QEMU recalculating the dirty bitmap for the guest RAM. See
> the spikes in the green line in:
>
> https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-1gb-1cpu/post-copy-bandwidth/post-copy-bw-1gbs.html
>
> - For the larger sized guests, the auto-converge code has to throttle the
> guest to as much as 90% or more before it is able to meet the 500ms max
> downtime value
>
> https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-1gb-1cpu/auto-converge-bandwidth/auto-converge-bw-1gbs.html
>
> Even then I often saw tests aborting as they hit the max number of
> iterations I permitted (30 iters max)
>
> https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-8gb-4cpu/auto-converge-bandwidth/auto-converge-bw-10gbs.html
>
> - MT compression is actively harmful to the chances of successful
> migration when the guest RAM is not compression friendly. My workload
> is the worst case since it is splattering RAM with totally random
> bytes. MT compression dramatically increases the time for each
> iteration, as we bottleneck on CPU compression speed, leaving the
> network largely idle. This causes a migration which would have
> completed without compression to fail. It also burns huge amounts of
> host CPU time.
>
> https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-1gb-1cpu/compr-mt/compr-mt-threads-4.html
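That bottleneck is easy to reproduce in miniature: MT compression is zlib-based, and zlib gains essentially nothing on uniformly random pages while still paying the full CPU cost. A quick stdlib demonstration (1 MiB samples, fastest compression level):

```python
import os
import zlib

SIZE = 1 << 20  # 1 MiB sample of "guest RAM"

random_ram = os.urandom(SIZE)   # the worst case: uniformly random bytes
zero_ram = bytes(SIZE)          # the friendliest case: untouched zero pages

# Compressed-size / original-size ratio at zlib level 1 (fastest).
random_ratio = len(zlib.compress(random_ram, 1)) / SIZE
zero_ratio = len(zlib.compress(zero_ram, 1)) / SIZE

# Random data does not shrink at all (ratio ~1.0, pure CPU cost),
# while zero pages collapse to almost nothing.
```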
>
> - XBZRLE compression did not have as much of a CPU performance penalty
> on the host as MT compression, but also did not help migration to
> actually complete. Again this is largely due to the workload being the
> worst case scenario with random bytes. The downside is obviously the
> potentially significant memory overhead on the host due to the cache
> sizing.
>
> https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-1gb-1cpu/compr-xbzrle/compr-xbzrle-cache-50.html
>
>
> - Post-copy, by its very nature, obviously ensured that the migration
> would complete. While post-copy was running in pre-copy mode there was
> a somewhat chaotic small impact on guest CPU performance, causing
> performance to periodically oscillate between 400ms/GB and 800ms/GB.
> This is less than the impact at the start of each migration iteration,
> which was 1000ms/GB in this test. There was also a massive penalty at
> the time of switchover from pre to post copy, as is to be expected. The
> migration completed in the post-copy phase quite quickly though. For
> this workload, the number of iterations in pre-copy mode before
> switching to post-copy did not have much impact. I expect a less
> extreme workload would have shown more interesting results wrt the
> number of iterations of pre-copy:
>
> https://berrange.fedorapeople.org/qemu-mig-test-2016-05-05/tcp-remote-8gb-4cpu/post-copy-iters.html
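For reference, driving a post-copy migration from a management layer boils down to three QMP commands. A sketch that just builds the command JSON (note the capability was named "x-postcopy-ram" before QEMU 2.6, and the destination URI here is a placeholder):

```python
import json

def qmp_cmd(name, **arguments):
    """Build a QMP command as a JSON-serialisable dict."""
    cmd = {"execute": name}
    if arguments:
        cmd["arguments"] = arguments
    return cmd

# 1. Enable the post-copy capability before migration starts.
enable = qmp_cmd("migrate-set-capabilities",
                 capabilities=[{"capability": "postcopy-ram",
                                "state": True}])
# 2. Kick off migration (placeholder destination URI).
start = qmp_cmd("migrate", uri="tcp:otherhost:4444")
# 3. After the desired number of pre-copy iterations, flip to post-copy.
switch = qmp_cmd("migrate-start-postcopy")

wire = "\n".join(json.dumps(c) for c in (enable, start, switch))
```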
>
>
> Overall, if we're looking for a solution that can guarantee completion
> under the most extreme guest workload, then only post-copy &
> auto-converge appear up to the job.
>
> The MT compression is seriously harmful to migration and has severe CPU
> overhead. The XBZRLE compression is moderately harmful to migration and
> has a potentially severe memory overhead, since large cache sizes are
> needed to make it useful.
>
> While auto-converge can ensure that guest migration completes, it has a
> pretty significant long-term impact on guest CPU performance to achieve
> this, i.e. the guest spends a long time in pre-copy mode with its CPUs
> very dramatically throttled down. The level of throttling required
> makes one wonder whether it is worth using at all, compared with simply
> pausing the guest workload. The latter has a hard blackout period, but
> over quite a short time frame if the network is fast.
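That "simply pause the guest" alternative is easy to quantify: in the idealised case the blackout is just RAM size over link speed, ignoring protocol overhead (and with the guest paused, nothing is re-dirtied). A back-of-envelope sketch:

```python
def pause_blackout_secs(ram_bytes: float, link_bits_per_sec: float) -> float:
    """Idealised blackout for migrating a paused guest: all of RAM must
    cross the wire exactly once, at full link speed. Ignores protocol
    overhead, so this is a lower bound."""
    return ram_bytes * 8 / link_bits_per_sec

GiB = 1 << 30
# The 8 GB guest from the tests, over a 10 Gb/s link: roughly 7 seconds.
blackout = pause_blackout_secs(8 * GiB, 10e9)
```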
>
> The post-copy code does have an impact on guest performance while in
> pre-copy mode, vs a plain migration. It also has a fairly high spike
> when in post-copy mode, but this lasts for a pretty short time. As
> compared to auto-converge, it is able to ensure the migration completes
> in a finite time without having a prolonged impact on guest CPU
> performance. The penalty during the post-copy phase is on a par with
> the penalty imposed by auto-converge when it has to throttle to 90%+.
>
>
> Overall, in the context of a worst case guest workload, it appears that
> post-copy is the clear winning strategy, ensuring completion of
> migration without imposing a long-duration penalty on guest
> performance. If the risk of failure from post-copy is unacceptable then
> auto-converge is a good fallback option, if the long-duration guest CPU
> penalty can be accepted.
>
> The compression options are only worth using if the host has free CPU
> resources, and the guest RAM is believed to be compression friendly, as they
> steal significant CPU time away from guests in order to run compression,
> often with a negative impact on migration completion chances.
>
MT compression should only be used when network bandwidth is the
bottleneck that affects live migration. Using another, faster
(de)compression algorithm could reduce the CPU overhead.
Liang
> Looking at migration in general, even with a 10-Gig-E NIC and RDMA
> transport it is possible for a single guest to provide a workload that
> will saturate the network during migration & thus prevent completion.
> Based on this, there is little point in attempting to run migrations in
> parallel on the same host, unless multiple NICs are available, as
> parallel migrations would reduce the chances of either one ever
> completing. Better reliability & faster overall completion would likely
> be achieved by fully serializing migration operations per host.
>
> There is clearly scope for more investigation here, in particular
>
> - Produce some alternative guest workloads that try to present
> a more "average" scenario workload, instead of the worst-case.
> These would likely allow compression to have some positive
> impact.
>
> - Try various combinations of strategies. For example, combining
> post-copy and auto-converge at the same time, or compression
> combined with either post-copy or auto-converge.
>
> - Investigate block migration performance too, with NBD migration
> server.
>
> - Investigate effect of dynamically changing max downtime value
> during migration, rather than using a fixed 500ms value.
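Such a dynamic downtime policy would be easy to script against QMP. A sketch that builds a geometric ramp of downtime values using the legacy "migrate_set_downtime" command (value in seconds; newer QEMU instead takes a "downtime-limit" in milliseconds via "migrate-set-parameters"):

```python
import json

def set_downtime_cmd(seconds: float) -> dict:
    """Build the legacy QMP 'migrate_set_downtime' command.
    The ramp schedule below is a hypothetical policy, not something
    the test framework currently implements."""
    return {"execute": "migrate_set_downtime",
            "arguments": {"value": seconds}}

# E.g. start at the fixed 500ms and grow the allowed downtime by 50%
# each pre-copy iteration that fails to converge.
schedule = [set_downtime_cmd(0.5 * (1.5 ** i)) for i in range(4)]
wire = [json.dumps(c) for c in schedule]
```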
>
>
> Daniel P. Berrange (6):
> scripts: add __init__.py file to scripts/qmp/
> scripts: add a 'debug' parameter to QEMUMonitorProtocol
> scripts: refactor the VM class in iotests for reuse
> scripts: set timeout when waiting for qemu monitor connection
> scripts: ensure monitor socket has SO_REUSEADDR set
> tests: introduce a framework for testing migration performance
>
> configure | 2 +
> scripts/qemu.py | 202 +++++++++++
> scripts/qmp/__init__.py | 0
> scripts/qmp/qmp.py | 15 +-
> scripts/qtest.py | 34 ++
> tests/Makefile | 12 +
> tests/migration/.gitignore | 2 +
> tests/migration/guestperf-batch.py | 26 ++
> tests/migration/guestperf-plot.py | 26 ++
> tests/migration/guestperf.py | 27 ++
> tests/migration/guestperf/__init__.py | 0
> tests/migration/guestperf/comparison.py | 124 +++++++
> tests/migration/guestperf/engine.py | 439 ++++++++++++++++++++++
> tests/migration/guestperf/hardware.py | 62 ++++
>  tests/migration/guestperf/plot.py       | 623 ++++++++++++++++++++++++++++++++
> tests/migration/guestperf/progress.py | 117 ++++++
> tests/migration/guestperf/report.py | 98 +++++
> tests/migration/guestperf/scenario.py | 95 +++++
> tests/migration/guestperf/shell.py | 255 +++++++++++++
> tests/migration/guestperf/timings.py | 55 +++
> tests/migration/stress.c | 367 +++++++++++++++++++
> tests/qemu-iotests/iotests.py | 135 +------
>  22 files changed, 2583 insertions(+), 133 deletions(-)
>  create mode 100644 scripts/qemu.py
>  create mode 100644 scripts/qmp/__init__.py
>  create mode 100644 tests/migration/.gitignore
>  create mode 100755 tests/migration/guestperf-batch.py
>  create mode 100755 tests/migration/guestperf-plot.py
>  create mode 100755 tests/migration/guestperf.py
>  create mode 100644 tests/migration/guestperf/__init__.py
>  create mode 100644 tests/migration/guestperf/comparison.py
>  create mode 100644 tests/migration/guestperf/engine.py
>  create mode 100644 tests/migration/guestperf/hardware.py
>  create mode 100644 tests/migration/guestperf/plot.py
>  create mode 100644 tests/migration/guestperf/progress.py
>  create mode 100644 tests/migration/guestperf/report.py
>  create mode 100644 tests/migration/guestperf/scenario.py
>  create mode 100644 tests/migration/guestperf/shell.py
>  create mode 100644 tests/migration/guestperf/timings.py
>  create mode 100644 tests/migration/stress.c
>
> --
> 2.5.5
>