Re: [PATCH v3 10/10] tests/migration-test: Add a test for postcopy hangs during RECOVER
From: Fabiano Rosas
Date: Mon, 09 Oct 2023 13:50:08 -0300

Fabiano Rosas <farosas@suse.de> writes:
> Peter Xu <peterx@redhat.com> writes:
>
>> On Thu, Oct 05, 2023 at 06:10:20PM -0300, Fabiano Rosas wrote:
>>> Peter Xu <peterx@redhat.com> writes:
>>>
>>> > On Thu, Oct 05, 2023 at 10:37:56AM -0300, Fabiano Rosas wrote:
>>> >> >> + /*
>>> >> >> + * Make sure both QEMU instances will go into RECOVER stage, then test
>>> >> >> + * kicking them out using migrate-pause.
>>> >> >> + */
>>> >> >> + wait_for_postcopy_status(from, "postcopy-recover");
>>> >> >> + wait_for_postcopy_status(to, "postcopy-recover");
>>> >> >
>>> >> > Is this wait out of place? I think we're trying to resume too fast after
>>> >> > migrate_recover():
>>> >> >
>>> >> > # {
>>> >> > # "error": {
>>> >> > # "class": "GenericError",
>>> >> > # "desc": "Cannot resume if there is no paused migration"
>>> >> > # }
>>> >> > # }
>>> >> >
>>> >>
>>> >> Ugh, sorry about the long lines:
>>> >>
>>> >> {
>>> >> "error": {
>>> >> "class": "GenericError",
>>> >> "desc": "Cannot resume if there is no paused migration"
>>> >> }
>>> >> }
>>> >
>>> > Sorry, I didn't get you here. Could you elaborate on your question?
>>> >
>>>
>>> The test is sometimes failing with the above message.
>>>
>>> But indeed my question doesn't make sense. I forgot migrate_recover
>>> happens on the destination. Never mind.
>>>
>>> The bug is still present nonetheless. We're going into migrate_prepare
>>> in some state other than POSTCOPY_PAUSED.
>>
>> Oh I see. Interestingly I cannot reproduce on my host, just like last
>> time..
>>
>> What is your setup for running the test? Anything special? Here's my
>> cmdline:
>
> The crudest oneliner:
>
> for i in $(seq 1 9999); do echo "$i ============="; \
> QTEST_QEMU_BINARY=./qemu-system-x86_64 \
> ./tests/qtest/migration-test -r /x86_64/migration/postcopy/recovery || break; \
> done
>
> I suspect my system has something specific to it that affects the timing
> of the tests. But I have no idea what it could be.
>
> $ lscpu
> Architecture: x86_64
> CPU op-mode(s): 32-bit, 64-bit
> Address sizes: 39 bits physical, 48 bits virtual
> Byte Order: Little Endian
> CPU(s): 16
> On-line CPU(s) list: 0-15
> Vendor ID: GenuineIntel
> Model name: 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz
> CPU family: 6
> Model: 141
> Thread(s) per core: 2
> Core(s) per socket: 8
> Socket(s): 1
> Stepping: 1
> CPU max MHz: 4800.0000
> CPU min MHz: 800.0000
> BogoMIPS: 4992.00
>
>>
>> $ cat reproduce.sh
>> index=$1
>> loop=0
>>
>> while :; do
>>     echo "Starting loop=$loop..."
>>     QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test \
>>         -p /x86_64/migration/postcopy/recovery/double-failures
>>     if [[ $? != 0 ]]; then
>>         echo "index $index REPRODUCED (loop=$loop) !"
>>         break
>>     fi
>>     loop=$(( loop + 1 ))
>> done
>>
>> Survives 200+ loops and kept going.
>>
>> However I think I saw what's wrong here, could you help try below fixup?
>>
>
> Sure. I won't get to it until tomorrow though.

It seems to have fixed the issue. 3500 iterations and still going.
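
For anyone who wants to repeat this kind of stress run, the two loops quoted above can be folded into one bounded helper. This is only a rough sketch, not something from the patch series: the function name stress_loop and its messages are made up for illustration, and it assumes bash. It runs the given command until it fails or the iteration limit is reached.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of QEMU or of this series): run a command
# repeatedly until it fails or a maximum number of iterations is reached.
# Usage: stress_loop MAX_ITERATIONS COMMAND [ARGS...]
stress_loop() {
    local max=$1
    shift
    local i
    for (( i = 1; i <= max; i++ )); do
        # Stop at the first failing run and report which iteration hit it.
        if ! "$@"; then
            echo "REPRODUCED at iteration $i"
            return 1
        fi
    done
    echo "survived $max iterations"
    return 0
}
```

With QTEST_QEMU_BINARY exported as in the commands above, it could be called as: stress_loop 9999 ./tests/qtest/migration-test -r /x86_64/migration/postcopy/recovery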