qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PULL 0/8] Monitor patches for 2018-10-30


From: Markus Armbruster
Subject: Re: [Qemu-devel] [PULL 0/8] Monitor patches for 2018-10-30
Date: Mon, 19 Nov 2018 19:08:13 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)

Peter Xu <address@hidden> writes:

> On Mon, Nov 19, 2018 at 02:17:27PM +0800, Peter Xu wrote:
>> I reproduced the error with a FreeBSD guest and this change (which
>> possibly can be squashed into "tests: qmp-test: add queue full test")
>> worked for me:
>> 
>> diff --git a/tests/qmp-test.c b/tests/qmp-test.c
>> index 6989acbca4..83f353db4f 100644
>> --- a/tests/qmp-test.c
>> +++ b/tests/qmp-test.c
>> @@ -281,8 +281,15 @@ static void test_qmp_oob(void)
>>       * will only be able to be handled after the queue is shrinked, so
>>       * it'll be processed only after one existing in-band command
>>       * finishes.
>> +     *
>> +     * NOTE: we need to feed the queue with one extra request to make
>> +     * sure it'll be stuck since when we have sent the Nth request
>> +     * it's possible that we have already popped and processing the
>> +     * 1st request so the Nth request (which could potentially be the
>> +     * [N-1]th element on the queue) might not trigger the
>> +     * monitor-full condition deterministically.
>>       */
>> -    for (i = 1; i <= QMP_REQ_QUEUE_LEN_MAX; i++) {
>> +    for (i = 1; i <= QMP_REQ_QUEUE_LEN_MAX + 1; i++) {
>>          id = g_strdup_printf("queue-blocks-%d", i);
>>          send_cmd_that_blocks(qts, id);
>>          g_free(id);
>> @@ -291,7 +298,7 @@ static void test_qmp_oob(void)
>>      unblock_blocked_cmd();
>>      recv_cmd_id(qts, "queue-blocks-1");
>>      recv_cmd_id(qts, "oob-1");
>> -    for (i = 2; i <= QMP_REQ_QUEUE_LEN_MAX; i++) {
>> +    for (i = 2; i <= QMP_REQ_QUEUE_LEN_MAX + 1; i++) {
>>          unblock_blocked_cmd();
>>          id = g_strdup_printf("queue-blocks-%d", i);
>>          recv_cmd_id(qts, id);
>> 
>> So the problem here is that the queue-block-N command might not really
>> suspend the monitor everytime if we already popped the 1st request,
>> which will let the N-th request to be (N-1)th, then the parser will
>> continue to eat the oob command and it could "preempt" the previous
>> commands.
>> 
>> Maybe FreeBSD is scheduling the threads in some pattern so it happens
>> only on FreeBSD and very constantly, but anyway it should be a general
>> fix to the test program.

I feel particularly dense right now, and need a more detailed analysis
to understand.  Let me try.

Here's what the test does before your fix

    Send QMP_REQ_QUEUE_LEN_MAX "blocking" in-band commands
    queue-blocks-1, queue-blocks-2, ...  "Blocking" means the command
    blocks until we explicitly unblock it.

    Send an out-of-band command oob-1.

    Unblock queue-blocks-1.

    Receive response to queue-blocks-1.

    Receive response to oob-1.

    Unblock queue-blocks-2, receive response
    ...

Here's what test test expects QEMU to do:

    In I/O thread:

        Receive and queue queue-blocks-1, ...

        Queue is full, stop receiving commands.

    In main thread:

        Dequeue queue-blocks-1.

        Execute queue-blocks-1, block for a while, finish and send
        response.

    In I/O thread:

        Queue is no longer full, resume receiving commands.

        Receive and execute oob-1, send response

    In main thread:

        Dequeue queue-blocks-2, execute, block for a while, finish and
        send response.
        ...

No good because the two threads race.  On my Linux box the test passes.
Let's have a closer look how it does that.  The appended patch that adds
another tracepoint.  With tracepoints monitor_* and handle_qmp_command
enabled, I get:

    address@hidden:handle_qmp_command mon 0x563e19be4e80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-1", "driver": "blkdebug", "config": "/tmp/qmp-test-EZroW3/fifo"}}

I/O thread receives and queues queue-blocks-1.

    address@hidden:monitor_qmp_cmd_in_band queue-blocks-1

Main thread dequeues qemu-blocks-1 and starts executing it.

We already lost: the queue won't actually fill up.  The test passes
anyway, but it doesn't actually test "queue full".  But why does it
pass?  Let's see.

    address@hidden:monitor_qmp_respond queue-blocks-1

Main thread sends response to queue-blocks-1.  This means queue-blocks-1
has been unblocked already.  This is possible, because qmp-test sends
all the commands and then immediately unblocks queue-blocks-1, and the
main thread can win the race with the I/O thread.

    address@hidden:handle_qmp_command mon 0x563e19be4e80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-2", "driver": "blkdebug", "config": "/tmp/qmp-test-EZroW3/fifo"}}
    address@hidden:monitor_qmp_cmd_in_band queue-blocks-2
    address@hidden:handle_qmp_command mon 0x563e19be4e80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-3", "driver": "blkdebug", "config": "/tmp/qmp-test-EZroW3/fifo"}}
    address@hidden:handle_qmp_command mon 0x563e19be4e80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-4", "driver": "blkdebug", "config": "/tmp/qmp-test-EZroW3/fifo"}}
    address@hidden:handle_qmp_command mon 0x563e19be4e80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-5", "driver": "blkdebug", "config": "/tmp/qmp-test-EZroW3/fifo"}}
    address@hidden:handle_qmp_command mon 0x563e19be4e80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-6", "driver": "blkdebug", "config": "/tmp/qmp-test-EZroW3/fifo"}}
    address@hidden:handle_qmp_command mon 0x563e19be4e80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-7", "driver": "blkdebug", "config": "/tmp/qmp-test-EZroW3/fifo"}}
    address@hidden:handle_qmp_command mon 0x563e19be4e80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-8", "driver": "blkdebug", "config": "/tmp/qmp-test-EZroW3/fifo"}}

I/O thread receives and queues queue-blocks-2, ... queue-blocks-8.

    address@hidden:handle_qmp_command mon 0x563e19be4e80 req: {"exec-oob": 
"migrate-pause"}
    address@hidden:monitor_qmp_cmd_out_of_band oob-1
    address@hidden:monitor_qmp_respond oob-1

I/O thread receives, executes and responds to oob-1.

    address@hidden:monitor_qmp_respond queue-blocks-2
    address@hidden:monitor_qmp_cmd_in_band queue-blocks-3
    address@hidden:monitor_qmp_respond queue-blocks-3
    address@hidden:monitor_qmp_cmd_in_band queue-blocks-4
    address@hidden:monitor_qmp_respond queue-blocks-4
    address@hidden:monitor_qmp_cmd_in_band queue-blocks-5
    address@hidden:monitor_qmp_respond queue-blocks-5
    address@hidden:monitor_qmp_cmd_in_band queue-blocks-6
    address@hidden:monitor_qmp_respond queue-blocks-6
    address@hidden:monitor_qmp_cmd_in_band queue-blocks-7
    address@hidden:monitor_qmp_respond queue-blocks-7
    address@hidden:monitor_qmp_cmd_in_band queue-blocks-8
    address@hidden:monitor_qmp_respond queue-blocks-8

Main thread dequeues, executes and responds to queue-blocks-2,
... queue-blocks-8.

If I make the main thread lose the race by sticking a sleep(1) before
the unblock_blocked_cmd() that unblocks queue-blocks-1, I get:

    address@hidden:handle_qmp_command mon 0x55f5b6a5de80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-1", "driver": "blkdebug", "config": "/tmp/qmp-test-PqO2uz/fifo"}}

I/O thread receives and queues queue-blocks-1.

    address@hidden:monitor_qmp_cmd_in_band queue-blocks-1

Main thread dequeues qemu-blocks-1 and starts executing it.  We already
lost.

    address@hidden:handle_qmp_command mon 0x55f5b6a5de80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-2", "driver": "blkdebug", "config": "/tmp/qmp-test-PqO2uz/fifo"}}
    address@hidden:handle_qmp_command mon 0x55f5b6a5de80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-3", "driver": "blkdebug", "config": "/tmp/qmp-test-PqO2uz/fifo"}}
    address@hidden:handle_qmp_command mon 0x55f5b6a5de80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-4", "driver": "blkdebug", "config": "/tmp/qmp-test-PqO2uz/fifo"}}
    address@hidden:handle_qmp_command mon 0x55f5b6a5de80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-5", "driver": "blkdebug", "config": "/tmp/qmp-test-PqO2uz/fifo"}}
    address@hidden:handle_qmp_command mon 0x55f5b6a5de80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-6", "driver": "blkdebug", "config": "/tmp/qmp-test-PqO2uz/fifo"}}
    address@hidden:handle_qmp_command mon 0x55f5b6a5de80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-7", "driver": "blkdebug", "config": "/tmp/qmp-test-PqO2uz/fifo"}}
    address@hidden:handle_qmp_command mon 0x55f5b6a5de80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-8", "driver": "blkdebug", "config": "/tmp/qmp-test-PqO2uz/fifo"}}

I/O thread receives and queues queue-blocks-2, ... queue-blocks-8.

    address@hidden:handle_qmp_command mon 0x55f5b6a5de80 req: {"exec-oob": 
"migrate-pause"}
    address@hidden:monitor_qmp_cmd_out_of_band oob-1
    address@hidden:monitor_qmp_respond oob-1

I/O thread receives, executes and responds to oob-1.

Test is going to fail: response to oob-1 sent before response to
queue-blocks-1.

Now let me try your fix.  Here's what I see:

    address@hidden:handle_qmp_command mon 0x5597b8e0de80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-1", "driver": "blkdebug", "config": "/tmp/qmp-test-8YRbOJ/fifo"}}
    address@hidden:monitor_qmp_cmd_in_band queue-blocks-1
    address@hidden:monitor_qmp_respond queue-blocks-1
    address@hidden:handle_qmp_command mon 0x5597b8e0de80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-2", "driver": "blkdebug", "config": "/tmp/qmp-test-8YRbOJ/fifo"}}
    address@hidden:monitor_qmp_cmd_in_band queue-blocks-2
    address@hidden:handle_qmp_command mon 0x5597b8e0de80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-3", "driver": "blkdebug", "config": "/tmp/qmp-test-8YRbOJ/fifo"}}
    address@hidden:handle_qmp_command mon 0x5597b8e0de80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-4", "driver": "blkdebug", "config": "/tmp/qmp-test-8YRbOJ/fifo"}}
    address@hidden:handle_qmp_command mon 0x5597b8e0de80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-5", "driver": "blkdebug", "config": "/tmp/qmp-test-8YRbOJ/fifo"}}
    address@hidden:handle_qmp_command mon 0x5597b8e0de80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-6", "driver": "blkdebug", "config": "/tmp/qmp-test-8YRbOJ/fifo"}}
    address@hidden:handle_qmp_command mon 0x5597b8e0de80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-7", "driver": "blkdebug", "config": "/tmp/qmp-test-8YRbOJ/fifo"}}
    address@hidden:handle_qmp_command mon 0x5597b8e0de80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-8", "driver": "blkdebug", "config": "/tmp/qmp-test-8YRbOJ/fifo"}}
    address@hidden:handle_qmp_command mon 0x5597b8e0de80 req: {"execute": 
"blockdev-add", "arguments": {"image": {"driver": "null-co"}, "node-name": 
"queue-blocks-9", "driver": "blkdebug", "config": "/tmp/qmp-test-8YRbOJ/fifo"}}
    address@hidden:handle_qmp_command mon 0x5597b8e0de80 req: {"exec-oob": 
"migrate-pause"}
    address@hidden:monitor_qmp_cmd_out_of_band oob-1
    address@hidden:monitor_qmp_respond oob-1
    address@hidden:monitor_qmp_respond queue-blocks-2
    address@hidden:monitor_qmp_cmd_in_band queue-blocks-3
    address@hidden:monitor_qmp_respond queue-blocks-3
    address@hidden:monitor_qmp_cmd_in_band queue-blocks-4
    address@hidden:monitor_qmp_respond queue-blocks-4
    address@hidden:monitor_qmp_cmd_in_band queue-blocks-5
    address@hidden:monitor_qmp_respond queue-blocks-5
    address@hidden:monitor_qmp_cmd_in_band queue-blocks-6
    address@hidden:monitor_qmp_respond queue-blocks-6
    address@hidden:monitor_qmp_cmd_in_band queue-blocks-7
    address@hidden:monitor_qmp_respond queue-blocks-7
    address@hidden:monitor_qmp_cmd_in_band queue-blocks-8
    address@hidden:monitor_qmp_respond queue-blocks-8
    address@hidden:monitor_qmp_cmd_in_band queue-blocks-9
    address@hidden:monitor_qmp_respond queue-blocks-9

Tracepoint monitor_suspend still not hit.  This means the queue does not
fill up.

> Markus, do you want me to repost a new version with this change?  Is
> it still possible to have the oob-default series for 3.1?

We're trying to, but we need to get the test to work.

Two problems:

1. The test doesn't seem to succed at testing "queue full".

2. The test might still be racy.

I suspect the code we test is solid, and it's "only" the test we screwed
up.


diff --git a/monitor.c b/monitor.c
index 431795d686..52112889ef 100644
--- a/monitor.c
+++ b/monitor.c
@@ -4064,6 +4064,7 @@ static int monitor_can_read(void *opaque)
  */
 static void monitor_qmp_respond(Monitor *mon, QDict *rsp, QObject *id)
 {
+    trace_monitor_qmp_respond(qobject_get_try_str(id) ?: "");
     if (rsp) {
         if (id) {
             qdict_put_obj(rsp, "id", qobject_ref(id));
diff --git a/trace-events b/trace-events
index 4fd2cb4b97..57c5f6cc94 100644
--- a/trace-events
+++ b/trace-events
@@ -50,6 +50,7 @@ handle_qmp_command(void *mon, const char *req) "mon %p req: 
%s"
 monitor_suspend(void *ptr, int cnt) "mon %p: %d"
 monitor_qmp_cmd_in_band(const char *id) "%s"
 monitor_qmp_cmd_out_of_band(const char *id) "%s"
+monitor_qmp_respond(const char *id) "%s"
 
 # dma-helpers.c
 dma_blk_io(void *dbs, void *bs, int64_t offset, bool to_dev) "dbs=%p bs=%p 
offset=%" PRId64 " to_dev=%d"



reply via email to

[Prev in Thread] Current Thread [Next in Thread]