qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2] migration/calc-dirty-rate: millisecond-granularity period


From: Yong Huang
Subject: Re: [PATCH v2] migration/calc-dirty-rate: millisecond-granularity period
Date: Sun, 6 Aug 2023 14:31:43 +0800



On Sat, Aug 5, 2023 at 2:05 AM Markus Armbruster <armbru@redhat.com> wrote:
Andrei Gudkov <gudkov.andrei@huawei.com> writes:

> Introduces alternative argument calc-time-ms, which is the
> the same as calc-time but accepts millisecond value.
> Millisecond granularity allows to make predictions whether
> migration will succeed or not. To do this, calculate dirty
> rate with calc-time-ms set to max allowed downtime, convert
> measured rate into volume of dirtied memory, and divide by
> network throughput. If the value is lower than max allowed
> downtime, then migration will converge.
>
> Measurement results for single thread randomly writing to
> a 1/4/24GiB memory region:
>
> +--------------+-----------------------------------------------+
> | calc-time-ms |                dirty rate MiB/s               |
> |              +----------------+---------------+--------------+
> |              | theoretical    | page-sampling | dirty-bitmap |
> |              | (at 3M wr/sec) |               |              |
> +--------------+----------------+---------------+--------------+
> |                             1GiB                             |
> +--------------+----------------+---------------+--------------+
> |          100 |           6996 |          7100 |         3192 |
> |          200 |           4606 |          4660 |         2655 |
> |          300 |           3305 |          3280 |         2371 |
> |          400 |           2534 |          2525 |         2154 |
> |          500 |           2041 |          2044 |         1871 |
> |          750 |           1365 |          1341 |         1358 |
> |         1000 |           1024 |          1052 |         1025 |
> |         1500 |            683 |           678 |          684 |
> |         2000 |            512 |           507 |          513 |
> +--------------+----------------+---------------+--------------+
> |                             4GiB                             |
> +--------------+----------------+---------------+--------------+
> |          100 |          10232 |          8880 |         4070 |
> |          200 |           8954 |          8049 |         3195 |
> |          300 |           7889 |          7193 |         2881 |
> |          400 |           6996 |          6530 |         2700 |
> |          500 |           6245 |          5772 |         2312 |
> |          750 |           4829 |          4586 |         2465 |
> |         1000 |           3865 |          3780 |         2178 |
> |         1500 |           2694 |          2633 |         2004 |
> |         2000 |           2041 |          2031 |         1789 |
> +--------------+----------------+---------------+--------------+
> |                             24GiB                            |
> +--------------+----------------+---------------+--------------+
> |          100 |          11495 |          8640 |         5597 |
> |          200 |          11226 |          8616 |         3527 |
> |          300 |          10965 |          8386 |         2355 |
> |          400 |          10713 |          8370 |         2179 |
> |          500 |          10469 |          8196 |         2098 |
> |          750 |           9890 |          7885 |         2556 |
> |         1000 |           9354 |          7506 |         2084 |
> |         1500 |           8397 |          6944 |         2075 |
> |         2000 |           7574 |          6402 |         2062 |
> +--------------+----------------+---------------+--------------+
>
> Theoretical values are computed according to the following formula:
> size * (1 - (1-(4096/size))^(time*wps)) / (time * 2^20),
> where size is in bytes, time is in seconds, and wps is number of
> writes per second.
>
> Signed-off-by: Andrei Gudkov <gudkov.andrei@huawei.com>
> ---
>  qapi/migration.json   | 14 ++++++--
>  migration/dirtyrate.h | 12 ++++---
>  migration/dirtyrate.c | 81 +++++++++++++++++++++++++------------------
>  3 files changed, 67 insertions(+), 40 deletions(-)
>
> diff --git a/qapi/migration.json b/qapi/migration.json
> index 8843e74b59..82493d6a57 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -1849,7 +1849,11 @@
>  # @start-time: start time in units of second for calculation
>  #
>  # @calc-time: time period for which dirty page rate was measured
> -#     (in seconds)
> +#     (rounded down to seconds).
> +#
> +# @calc-time-ms: actual time period for which dirty page rate was
> +#     measured (in milliseconds).  Value may be larger than requested
> +#     time period due to measurement overhead.
>  #
>  # @sample-pages: number of sampled pages per GiB of guest memory.
>  #     Valid only in page-sampling mode (Since 6.1)
> @@ -1866,6 +1870,7 @@
>             'status': 'DirtyRateStatus',
>             'start-time': 'int64',
>             'calc-time': 'int64',
> +           'calc-time-ms': 'int64',
>             'sample-pages': 'uint64',
>             'mode': 'DirtyRateMeasureMode',
>             '*vcpu-dirty-rate': [ 'DirtyRateVcpu' ] } }
> @@ -1908,6 +1913,10 @@
>  #     dirty during @calc-time period, further writes to this page will
>  #     not increase dirty page rate anymore.
>  #
> +# @calc-time-ms: the same as @calc-time but in milliseconds.  These
> +#    two arguments are mutually exclusive.  Exactly one of them must
> +#    be specified. (Since 8.1)
> +#
>  # @sample-pages: number of sampled pages per each GiB of guest memory.
>  #     Default value is 512.  For 4KiB guest pages this corresponds to
>  #     sampling ratio of 0.2%.  This argument is used only in page
> @@ -1925,7 +1934,8 @@
>  #                                                 'sample-pages': 512} }
>  # <- { "return": {} }
>  ##
> -{ 'command': 'calc-dirty-rate', 'data': {'calc-time': 'int64',
> +{ 'command': 'calc-dirty-rate', 'data': {'*calc-time': 'int64',
> +                                         '*calc-time-ms': 'int64',
>                                           '*sample-pages': 'int',
>                                           '*mode': 'DirtyRateMeasureMode'} }


Having both @calc-time and @calc-time-ms is ugly.

Can we deprecate @calc-time?
Since the upper app Libvirt has used this field to implement
the virDomainStartDirtyRateCalc API unfortunately.
Deprecating this requires the extra patch on Libvirt but no
functional improvement, IMHO, the field could remain untouched.

I don't like the name @calc-time-ms.  We don't put units in names
elsewhere.

Differently ugly: new member containing the fractional part, i.e. time
in seconds = calc-time + fractional-part / 1000.  With a better name, of
course.

[...]


Thanks,

Yong
--
Best regards

reply via email to

[Prev in Thread] Current Thread [Next in Thread]