qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 2/3] block: add logging facility for long standing IO request


From: Denis V. Lunev
Subject: Re: [PATCH 2/3] block: add logging facility for long standing IO requests
Date: Thu, 17 Sep 2020 18:27:11 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0

On 9/17/20 4:56 PM, Max Reitz wrote:
> On 10.08.20 12:14, Denis V. Lunev wrote:
>> There are severe delays with IO requests processing if QEMU is running in
>> virtual machine or over software defined storage. Such delays potentially
>> results in unpredictable guest behavior. For example, guests over IDE or
>> SATA drive could remount filesystem read-only if write is performed
>> longer than 10 seconds.
>>
>> Such reports are very complex to process. Some good starting point for this
>> seems quite reasonable. This patch provides one. It adds logging of such
>> potentially dangerous long IO operations.
>>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> CC: Stefan Hajnoczi <stefanha@redhat.com>
>> CC: Kevin Wolf <kwolf@redhat.com>
>> CC: Max Reitz <mreitz@redhat.com>
>> ---
>>  block/accounting.c         | 59 +++++++++++++++++++++++++++++++++++++-
>>  blockdev.c                 |  7 ++++-
>>  include/block/accounting.h |  5 +++-
>>  3 files changed, 68 insertions(+), 3 deletions(-)
>>
>> diff --git a/block/accounting.c b/block/accounting.c
>> index 8d41c8a83a..24f48c84b8 100644
>> --- a/block/accounting.c
>> +++ b/block/accounting.c
> [...]
>
>> @@ -182,6 +185,58 @@ void block_latency_histograms_clear(BlockAcctStats 
>> *stats)
>>      }
>>  }
>>  
>> +static const char *block_account_type(enum BlockAcctType type)
>> +{
>> +    switch (type) {
>> +    case BLOCK_ACCT_READ:
>> +        return "READ";
>> +    case BLOCK_ACCT_WRITE:
>> +        return "WRITE";
>> +    case BLOCK_ACCT_FLUSH:
>> +        return "DISCARD";
>> +    case BLOCK_ACCT_UNMAP:
>> +        return "TRUNCATE";
> I don’t understand how FLUSH translates to DISCARD, and UNMAP to TRUNCATE.
>

Ha, seems I have made copy/pasted and missed this.
Nice catch.

>> +    case BLOCK_ACCT_NONE:
>> +        return "NONE";
>> +    case BLOCK_MAX_IOTYPE:
>> +        break;
>> +    }
>> +    return "UNKNOWN";
>> +}
>> +
>> +static void block_acct_report_long(BlockAcctStats *stats,
>> +                                   BlockAcctCookie *cookie, int64_t 
>> latency_ns)
>> +{
>> +    unsigned latency_ms = latency_ns / SCALE_MS;
>> +    int64_t start_time_host_s;
>> +    char buf[64];
>> +    struct tm t;
>> +    BlockDriverState *bs;
>> +
>> +    if (cookie->type == BLOCK_ACCT_NONE) {
>> +        return;
>> +    }
>> +    if (stats->latency_log_threshold_ms == 0) {
>> +        return;
>> +    }
>> +    if (latency_ms < stats->latency_log_threshold_ms) {
>> +        return;
>> +    }
>> +
>> +    start_time_host_s = cookie->start_time_ns / NANOSECONDS_PER_SECOND;
>> +    strftime(buf, sizeof(buf), "%m-%d %H:%M:%S",
>> +             localtime_r(&start_time_host_s, &t));
>> +
>> +    bs = blk_bs(blk_stats2blk(stats));
>> +    qemu_log("long %s[%ld] IO request: %d.03%d since %s.%03d bs: %s(%s, 
>> %s)\n",
>> +             block_account_type(cookie->type), cookie->bytes,
>> +             (int)(latency_ms / 1000), (int)(latency_ms % 1000), buf,
> Why not just latency_ms * .001f and %.3f?
I personally dislike floating point, this leads to
a bit strange output.


>
>> +             (int)((cookie->start_time_ns / 1000000) % 1000),
> s/1000000/SCALE_MS/? (But I’m not sure whether that’s what the SCALE_?S
> are for.)
yes, you right.

>> +             bs == NULL ? "unknown" : bdrv_get_node_name(bs),
>> +             bs == NULL ? "unknown" : bdrv_get_format_name(bs),
>> +             bs == NULL ? "unknown" : bs->filename);
> Now that I’m writing this response already, I wonder whether a QMP event
> wouldn’t be nice.  (But considering that accounting apparently just
> doesn’t with -blockdev, I suppose that’s not that big of a worry.)
may be. Why not?

Though there is some note. Usually after the stall there
is a bunch of requests completed at the same time
once the storage is unfrozen, thus some aggregation
would be required.

I think that this should be made separately.
>> +}
>> +
>>  static void block_account_one_io(BlockAcctStats *stats, BlockAcctCookie 
>> *cookie,
>>                                   bool failed)
>>  {
> [...]
>
>> diff --git a/blockdev.c b/blockdev.c
>> index 3848a9c8ab..66158d1292 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -622,7 +622,8 @@ static BlockBackend *blockdev_init(const char *file, 
>> QDict *bs_opts,
>>  
>>          bs->detect_zeroes = detect_zeroes;
>>  
>> -        block_acct_setup(blk_get_stats(blk), account_invalid, 
>> account_failed);
>> +        block_acct_setup(blk_get_stats(blk), account_invalid, 
>> account_failed,
>> +                qemu_opt_get_number(opts, "latency-log-threshold", 0));
> latency_log_threshold_ms is an unsigned int and so this will silently
> overflow for values >= 2^32.
>
> (Or for user-specified values < 0, which are wrapped around.)
>
>>  
>>          if (!parse_stats_intervals(blk_get_stats(blk), interval_list, 
>> errp)) {
>>              blk_unref(blk);
>> @@ -3727,6 +3728,10 @@ QemuOptsList qemu_common_drive_opts = {
>>              .type = QEMU_OPT_BOOL,
>>              .help = "whether to account for failed I/O operations "
>>                      "in the statistics",
>> +        },{
>> +            .name = "latency-log-threshold",
>> +            .type = QEMU_OPT_NUMBER,
>> +            .help = "threshold for long I/O report (disabled if <=0), in 
>> ms",
> Because of that overflow, negative values will not necessarily disable
> reporting, because truncating them to 32 bit may make them small
> absolute values again.
>
> In any case, I’d just say “disabled if 0”, and not mention anything
> about <0.
will do.

Thanks for review!

Den



reply via email to

[Prev in Thread] Current Thread [Next in Thread]