qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH] block: add watermark event


From: Eric Blake
Subject: Re: [Qemu-devel] [PATCH] block: add watermark event
Date: Tue, 08 Jul 2014 09:10:57 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0

On 07/08/2014 08:49 AM, Francesco Romani wrote:
> Managing applications, like oVirt (http://www.ovirt.org), make extensive
> use of thin-provisioned disk images.
> In order to let the guest run flawlessly and be not unnecessarily
> paused, oVirt sets a watermark based on the percentage occupation of the
> device against the advertised size, and automatically resizes the image
> once the watermark is reached or exceeded.
> 
> In order to detect the mark crossing, the managing application has no
> choice than aggressively polling the QEMU monitor using the
> query-blockstats command. This lead to unnecessary system
> load, and is made even worse under scale: scenarios
> with hundreds of VM are becoming not unusual.
> 
> To fix this, this patch adds:
> * A new monitor command to set a mark for a given block device.
> * A new event to report if a block device usage exceeds the threshold.
> 
> This will allow the managing application to drop the polling
> alltogether and just wait for a watermark crossing event.

s/alltogether/altogether/

> 
> Signed-off-by: Francesco Romani <address@hidden>
> ---
>  block.c                   | 56 
> +++++++++++++++++++++++++++++++++++++++++++++++
>  blockdev.c                | 21 ++++++++++++++++++
>  include/block/block.h     |  2 ++
>  include/block/block_int.h |  3 +++
>  qapi/block-core.json      | 33 ++++++++++++++++++++++++++++
>  qmp-commands.hx           | 24 ++++++++++++++++++++
>  6 files changed, 139 insertions(+)
> 


> +void bdrv_set_watermark_perc(BlockDriverState *bs,
> +                             int watermark_perc)
> +{
> +    NotifierWithReturn before_write = {
> +        .notify = watermark_before_write_notify,
> +    };
> +
> +    if (watermark_perc <= 0) {
> +        return;
> +    }

Shouldn't you uninstall the write_notifier if someone is changing the
watermark_perc from non-zero back to zero?


> +++ b/include/block/block_int.h
> @@ -393,6 +393,9 @@ struct BlockDriverState {
>  
>      /* The error object in use for blocking operations on backing_hd */
>      Error *backing_blocker;
> +
> +    /* watermark limit for writes, percentage of sectors */
> +    int wr_watermark_perc;

I didn't see any code that ensures that this limit is between 0 and 100;
since it is under user control, your code probably misbehaves for values
such as 101.

> +
> +##
> +# @BLOCK_WATERMARK
> +#
> +# Emitted when a block device reaches or exceeds the watermark limit.
> +#
> +# @device: device name
> +#
> +# @sector-num: number of the sector exceeding the threshold
> +#
> +# @sector-highest: number of the last highest written sector
> +#
> +# Since: 2.1

2.2. You've missed hard freeze for 2.1.

> +##
> +{ 'event': 'BLOCK_WATERMARK',
> +  'data': { 'device': 'str', 'sector-num': 'int', 'sector-highest': 'int' } }
> +
> +##
> +# @block_set_watermark

s/_/-/2 - new commands should use - in their name.

> +#
> +# Change watermark percentage for a block drive.
> +#
> +# @device: The name of the device
> +#
> +# @watermark: high water mark, percentage

Is percentage the right unit?  On a 10G disk, that means you only have a
granularity down to 100M.  Should the watermark limit instead be
expressed as a byte value instead of a percentage?

> +#
> +# Returns: Nothing on success
> +#          If @device is not a valid block device, DeviceNotFound
> +#
> +# Since: 2.1

2.2.

> +##
> +{ 'command': 'block_set_watermark',
> +  'data': { 'device': 'str', 'watermark': 'int' } }

I hate write-only commands.  Which query-* command are you modifying to
output the current watermark level?

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]