[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] Aborts in iotest 169
From: |
Vladimir Sementsov-Ogievskiy |
Subject: |
Re: [Qemu-devel] Aborts in iotest 169 |
Date: |
Thu, 24 Jan 2019 08:08:45 +0000 |
23.01.2019 21:08, Dr. David Alan Gilbert wrote:
> * Max Reitz (address@hidden) wrote:
>> On 23.01.19 17:35, Dr. David Alan Gilbert wrote:
>>> * Luiz Capitulino (address@hidden) wrote:
>>>> On Wed, 23 Jan 2019 17:12:35 +0100
>>>> Max Reitz <address@hidden> wrote:
>>>>
>>>>> On 23.01.19 17:04, Luiz Capitulino wrote:
>>>>>> On Wed, 23 Jan 2019 16:48:49 +0100
>>>>>> Max Reitz <address@hidden> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> When running 169 in parallel (e.g. like so:
>>>>>>>
>>>>>>> $ while TEST_DIR=/tmp/t0 ./check -T -qcow2 169; do; done
>>>>>>> $ while TEST_DIR=/tmp/t1 ./check -T -qcow2 169; do; done
>>>>>>> $ while TEST_DIR=/tmp/t2 ./check -T -qcow2 169; do; done
>>>>>>> $ while TEST_DIR=/tmp/t3 ./check -T -qcow2 169; do; done
>>>>>>>
>>>>>>> in four different shells), I get aborts:
>>>>>>
>>>>>> OK, is this part of a test-suite that's also running migration
>>>>>> tests in parallel or in sequence? In other words, what does
>>>>>> iotests have to do with migration (sorry if this is stupid
>>>>>> question, but it's been years I don't do qemu).
>>>>>
>>>>> They run migration tests in sequence, but if you run multiple test
>>>>> instances in parallel, well, then they will be run in parallel.
>>>>>
>>>>> The only reason I CC'd you was because you were so prominent in git
>>>>> blame. O:-)
>>>>
>>>> Yeah, that's often the case with me :-)
>>>>
>>>>>> When this happened in the past it meant some QEMU code skipped a
>>>>>> transition, but I can't tell what this has to do with iotests.
>>>>>
>>>>> Well, this iotest (which tests a migration configuration) sometimes
>>>>> apparently results in this invalid transition. But that can't be just
>>>>> the test's fault, as qemu should handle that gracefully.
>>>>
>>>> Does iotest run a guest or does it only executes parts of qemu
>>>> code? If it's the latter, then I'd guess the test code is missing
>>>> calling qemu code that sets the appropriate state between
>>>> running and postmigrate states.
>>>>
>>>>> It's probably an issue in the migration code and not so much in vl.c,
>>>>> yes...
>>>>
>>>> Yeah, I'll let the migration people jump in.
>>>
>>> Can we get a log of the qmp commands when it fails?
>>> A 'running->postmigrate' transition is a bit weird; you could get a
>>> something->postmigrate if you migrate a VM twice, and the source is
>>> already in postmigrate, so it tells the destination to go into
>>> postmigrate. That's odd in itself, but not *that* odd.
>>> But then why is it going running->postmigrate? That suggests
>>> the source is currently running when it receives an incoming migration,
>>> and that really shouldn't happen.
>>
>> This is what I get with -d:
>>
>>>
>>> test_do_test_migration_resume_source_persistent__not_migbitmap
>>> (__main__.TestDirtyBitmapMigration) ... DEBUG:QMP:>>> {'execute':
>>> 'qmp_capabilities'}
>>> DEBUG:QMP:<<< {u'return': {}}
>>> DEBUG:QMP:>>> {'execute': 'migrate-set-capabilities', 'arguments':
>>> {'capabilities': [{'capability': 'events', 'state': True}]}}
>>> DEBUG:QMP:<<< {u'return': {}}
>>> DEBUG:QMP:>>> {'execute': 'block-dirty-bitmap-add', 'arguments': {'node':
>>> 'drive0', 'persistent': True, 'name': 'bitmap0', 'granularity': 512}}
>>> DEBUG:QMP:<<< {u'return': {}}
>>> DEBUG:QMP:>>> {'execute': 'human-monitor-command', 'arguments':
>>> {'command-line': 'qemu-io drive0 "write 0 65536"'}}
>>> DEBUG:QMP:<<< {u'return': u''}
>>> DEBUG:QMP:>>> {'execute': 'human-monitor-command', 'arguments':
>>> {'command-line': 'qemu-io drive0 "write 983040 65536"'}}
>>> DEBUG:QMP:<<< {u'return': u''}
>>> DEBUG:QMP:>>> {'execute': 'human-monitor-command', 'arguments':
>>> {'command-line': 'qemu-io drive0 "write 655873 4096"'}}
>>> DEBUG:QMP:<<< {u'return': u''}
>>> DEBUG:QMP:>>> {'execute': 'x-debug-block-dirty-bitmap-sha256', 'arguments':
>>> {'node': 'drive0', 'name': 'bitmap0'}}
>>> DEBUG:QMP:<<< {u'return': {u'sha256':
>>> u'01d2ebedcb8f549a2547dbf8e231c410e3e747a9479e98909fc936e0035cf8b1'}}
>>> DEBUG:QMP:>>> {'execute': 'migrate', 'arguments': {'uri': 'exec: cat >
>>> /tmp/t0/test-qcow2-file/mig_file'}}
>>> DEBUG:QMP:<<< {u'timestamp': {u'seconds': 1548263213, u'microseconds':
>>> 301029}, u'data': {u'status': u'setup'}, u'event': u'MIGRATION'}
>>> DEBUG:QMP:<<< {u'return': {}}
>>> DEBUG:QMP:<<< {u'timestamp': {u'seconds': 1548263213, u'microseconds':
>>> 302836}, u'data': {u'pass': 1}, u'event': u'MIGRATION_PASS'}
>>> DEBUG:QMP:<<< {u'timestamp': {u'seconds': 1548263213, u'microseconds':
>>> 302899}, u'data': {u'status': u'active'}, u'event': u'MIGRATION'}
>>> DEBUG:QMP:<<< {u'timestamp': {u'seconds': 1548263213, u'microseconds':
>>> 312321}, u'event': u'STOP'}
>>> DEBUG:QMP:<<< {u'timestamp': {u'seconds': 1548263213, u'microseconds':
>>> 312413}, u'data': {u'pass': 2}, u'event': u'MIGRATION_PASS'}
>>> DEBUG:QMP:<<< {u'timestamp': {u'seconds': 1548263213, u'microseconds':
>>> 314779}, u'data': {u'status': u'completed'}, u'event': u'MIGRATION'}
>>> DEBUG:QMP:>>> {'execute': 'x-debug-block-dirty-bitmap-sha256', 'arguments':
>>> {'node': 'drive0', 'name': 'bitmap0'}}
>>> DEBUG:QMP:<<< {u'error': {u'class': u'GenericError', u'desc': u"Dirty
>>> bitmap 'bitmap0' not found"}}
>>> DEBUG:QMP:>>> {'execute': 'cont'}
>>> DEBUG:QMP:<<< {u'timestamp': {u'seconds': 1548263213, u'microseconds':
>>> 316624}, u'event': u'RESUME'}
>>> DEBUG:QMP:<<< {u'return': {}}
>>> DEBUG:QMP:>>> {'execute': 'x-debug-block-dirty-bitmap-sha256', 'arguments':
>>> {'node': 'drive0', 'name': 'bitmap0'}}
>>> WARNING:qemu:qemu received signal 6:
>>> build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -chardev
>>> socket,id=mon,path=/tmp/t0/test-qcow2-file/tmp1d4GTg/qemua-28430-monitor.sock
>>> -mon chardev=mon,mode=control -display none -vga none -qtest
>>> unix:path=/tmp/t0/test-qcow2-file/qemua-28430-qtest.sock -machine
>>> accel=qtest -nodefaults -machine accel=qtest -drive
>>> if=virtio,id=drive0,file=/tmp/t0/test-qcow2-file/disk_a,format=qcow2,cache=writethrough
>>> ERROR
>>>
>>> ======================================================================
>>> ERROR: test_do_test_migration_resume_source_persistent__not_migbitmap
>>> (__main__.TestDirtyBitmapMigration)
>>> ----------------------------------------------------------------------
>>> Traceback (most recent call last):
>>> File "169", line 206, in <lambda>
>>> setattr(klass, 'test_' + method + name, lambda self: mc(self))
>>> File "169", line 113, in do_test_migration_resume_source
>>> self.check_bitmap(self.vm_a, sha256)
>>> File "169", line 72, in check_bitmap
>>> node='drive0', name='bitmap0')
>>> File "tests/qemu-iotests/../../scripts/qemu.py", line 368, in qmp
>>> return self._qmp.cmd(cmd, args=qmp_args)
>>> File "tests/qemu-iotests/../../scripts/qmp/qmp.py", line 191, in cmd
>>> return self.cmd_obj(qmp_cmd)
>>> File "tests/qemu-iotests/../../scripts/qmp/qmp.py", line 174, in cmd_obj
>>> resp = self.__json_read()
>>> File "tests/qemu-iotests/../../scripts/qmp/qmp.py", line 82, in
>>> __json_read
>>> data = self.__sockfile.readline()
>>> File "/usr/lib64/python2.7/socket.py", line 451, in readline
>>> data = self._sock.recv(self._rbufsize)
>>> error: [Errno 104] Connection reset by peer
>>>
>>> ----------------------------------------------------------------------
>>
>>
>> I'm not sure whether the problem is really from
>> x-debug-block-dirty-bitmap-sha256, or from the self.vm_a.shutdown() on
>> line 116 in iotest 169... Or maybe from something else entirely?
>>
>>
>> A slightly different failure pattern is this:
>>
>>> test_do_test_migration_resume_source_not_persistent__migbitmap
>>> (__main__.TestDirtyBitmapMigration) ... DEBUG:QMP:>>> {'execute':
>>> 'qmp_capabilities'}
>>> DEBUG:QMP:<<< {u'return': {}}
>>> DEBUG:QMP:>>> {'execute': 'migrate-set-capabilities', 'arguments':
>>> {'capabilities': [{'capability': 'events', 'state': True}, {'capability':
>>> 'dirty-bitmaps', 'state': True}]}}
>>> DEBUG:QMP:<<< {u'return': {}}
>>> DEBUG:QMP:>>> {'execute': 'block-dirty-bitmap-add', 'arguments': {'node':
>>> 'drive0', 'name': 'bitmap0', 'granularity': 512}}
>>> DEBUG:QMP:<<< {u'return': {}}
>>> DEBUG:QMP:>>> {'execute': 'human-monitor-command', 'arguments':
>>> {'command-line': 'qemu-io drive0 "write 0 65536"'}}
>>> DEBUG:QMP:<<< {u'return': u''}
>>> DEBUG:QMP:>>> {'execute': 'human-monitor-command', 'arguments':
>>> {'command-line': 'qemu-io drive0 "write 983040 65536"'}}
>>> DEBUG:QMP:<<< {u'return': u''}
>>> DEBUG:QMP:>>> {'execute': 'human-monitor-command', 'arguments':
>>> {'command-line': 'qemu-io drive0 "write 655873 4096"'}}
>>> DEBUG:QMP:<<< {u'return': u''}
>>> DEBUG:QMP:>>> {'execute': 'x-debug-block-dirty-bitmap-sha256', 'arguments':
>>> {'node': 'drive0', 'name': 'bitmap0'}}
>>> DEBUG:QMP:<<< {u'return': {u'sha256':
>>> u'01d2ebedcb8f549a2547dbf8e231c410e3e747a9479e98909fc936e0035cf8b1'}}
>>> DEBUG:QMP:>>> {'execute': 'migrate', 'arguments': {'uri': 'exec: cat >
>>> /tmp/t0/test-qcow2-file/mig_file'}}
>>> DEBUG:QMP:<<< {u'timestamp': {u'seconds': 1548263493, u'microseconds':
>>> 466831}, u'data': {u'status': u'setup'}, u'event': u'MIGRATION'}
>>> DEBUG:QMP:<<< {u'return': {}}
>>> DEBUG:QMP:<<< {u'timestamp': {u'seconds': 1548263493, u'microseconds':
>>> 472171}, u'data': {u'pass': 1}, u'event': u'MIGRATION_PASS'}
>>> DEBUG:QMP:<<< {u'timestamp': {u'seconds': 1548263493, u'microseconds':
>>> 472233}, u'data': {u'status': u'active'}, u'event': u'MIGRATION'}
>>> DEBUG:QMP:<<< {u'timestamp': {u'seconds': 1548263493, u'microseconds':
>>> 571246}, u'data': {u'pass': 2}, u'event': u'MIGRATION_PASS'}
>>> DEBUG:QMP:<<< {u'timestamp': {u'seconds': 1548263493, u'microseconds':
>>> 571665}, u'event': u'STOP'}
>>> DEBUG:QMP:<<< {u'timestamp': {u'seconds': 1548263493, u'microseconds':
>>> 571934}, u'data': {u'pass': 3}, u'event': u'MIGRATION_PASS'}
>>> DEBUG:QMP:<<< {u'timestamp': {u'seconds': 1548263493, u'microseconds':
>>> 574302}, u'data': {u'status': u'completed'}, u'event': u'MIGRATION'}
>>> DEBUG:QMP:>>> {'execute': 'x-debug-block-dirty-bitmap-sha256', 'arguments':
>>> {'node': 'drive0', 'name': 'bitmap0'}}
>>> DEBUG:QMP:<<< {u'return': {u'sha256':
>>> u'01d2ebedcb8f549a2547dbf8e231c410e3e747a9479e98909fc936e0035cf8b1'}}
>>> DEBUG:QMP:>>> {'execute': 'cont'}
>>> DEBUG:QMP:<<< {u'timestamp': {u'seconds': 1548263493, u'microseconds':
>>> 576305}, u'event': u'RESUME'}
>>> DEBUG:QMP:<<< {u'return': {}}
>>> DEBUG:QMP:>>> {'execute': 'x-debug-block-dirty-bitmap-sha256', 'arguments':
>>> {'node': 'drive0', 'name': 'bitmap0'}}
>>> DEBUG:QMP:<<< {u'return': {u'sha256':
>>> u'01d2ebedcb8f549a2547dbf8e231c410e3e747a9479e98909fc936e0035cf8b1'}}
>>> DEBUG:QMP:>>> {'execute': 'quit'}
>>> WARNING:qemu:qemu received signal 6:
>>> build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -chardev
>>> socket,id=mon,path=/tmp/t0/test-qcow2-file/tmpG2ZjZB/qemua-4834-monitor.sock
>>> -mon chardev=mon,mode=control -display none -vga none -qtest
>>> unix:path=/tmp/t0/test-qcow2-file/qemua-4834-qtest.sock -machine
>>> accel=qtest -nodefaults -machine accel=qtest -drive
>>> if=virtio,id=drive0,file=/tmp/t0/test-qcow2-file/disk_a,format=qcow2,cache=writethrough
>>> WARNING:qemu:qemu received signal 6:
>>> build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -chardev
>>> socket,id=mon,path=/tmp/t0/test-qcow2-file/tmpG2ZjZB/qemua-4834-monitor.sock
>>> -mon chardev=mon,mode=control -display none -vga none -qtest
>>> unix:path=/tmp/t0/test-qcow2-file/qemua-4834-qtest.sock -machine
>>> accel=qtest -nodefaults -machine accel=qtest -drive
>>> if=virtio,id=drive0,file=/tmp/t0/test-qcow2-file/disk_a,format=qcow2,cache=writethrough
>>>
>>> ======================================================================
>>> FAIL: test_do_test_migration_resume_source_not_persistent__migbitmap
>>> (__main__.TestDirtyBitmapMigration)
>>> ----------------------------------------------------------------------
>>> Traceback (most recent call last):
>>> File "169", line 206, in <lambda>
>>> setattr(klass, 'test_' + method + name, lambda self: mc(self))
>>> File "169", line 125, in do_test_migration_resume_source
>>> self.assertEqual(log, '')
>>> AssertionError: "qemu-system-x86_64: invalid runstate transition: 'running'
>>> -> 'postmigrate'\n" != ''
>>
>> So here looks more like the shutdown is the problem.
>
> It doesn't repeat for me here in a few minutes of testing; can you get
> me a full all-threads backtrace for those asserts?
>
Reproduced for me after about 1-3 minutes on tmpfs, did you try it?
--
Best regards,
Vladimir
Re: [Qemu-devel] Aborts in iotest 169, Vladimir Sementsov-Ogievskiy, 2019/01/24
Re: [Qemu-devel] [Qemu-block] Aborts in iotest 169, Kevin Wolf, 2019/01/24
Re: [Qemu-devel] [Qemu-block] Aborts in iotest 169, Dr. David Alan Gilbert, 2019/01/24
Re: [Qemu-devel] [Qemu-block] Aborts in iotest 169, Kevin Wolf, 2019/01/24