From: Halil Pasic
Subject: Re: [qemu-s390x] [Qemu-devel] [RFC PATCH v2 1/1] s390x/css: unrestrict cssids
Date: Wed, 29 Nov 2017 17:30:15 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0


On 11/29/2017 12:47 PM, Cornelia Huck wrote:
> On Wed, 29 Nov 2017 16:17:35 +0800
> Dong Jia Shi <address@hidden> wrote:
> 
>> * Halil Pasic <address@hidden> [2017-11-28 14:07:58 +0100]:
>>
>> [...]
>>> The auto-generated bus ids are affected by both changes. We hope to not
>>> encounter any auto-generated bus ids in production as Libvirt is always
>>> explicit about the bus id.  Since 8ed179c937 ("s390x/css: catch section
>>> mismatch on load", 2017-05-18) the worst that can happen because the same
>>> device ended up having a different bus id is a cleanly failed migration.
>>> I find it hard to reason about the impact of changed auto-generated bus
>>> ids on migration for command line users as I don't know which rules
>>> such a user is supposed to follow.
>> For this paragraph, Halil pointed me to a case that he has in mind.
>> 1. VM configuration with 3 devices:
>>   -device virtio (e.g. virtio-blk-ccw,id=disk0)
>>   -device vfio-ccw (e.g. id=vfio0)
>>   -device virtio (e.g. virtio-rng-ccw,id=rng0)
>> 2. Start the vm.
>> 3. device_del vfio0
>> 4. migrate "exec:gzip -c > /tmp/tmp_vmstate.gz"
>> 5. modify cmd line from step 1 by removing the vfio0 device, and adding:
>>    -incoming "exec:gzip -c -d /tmp/tmp_vmstate.gz"
>>
>> Let me list my test results here for everybody's reference.
>>
>> W/o this patch
>> ==============
>>
>> ------------+---------------+-------------
>>             | squashing off | squashing on
>> ------------+---------------+-------------
>>     auto id |        F      |     F
>> ------------+---------------+-------------
>> explicit id |        F      |     S
>> ------------+---------------+-------------
>>
>> T1. squashing off + auto id
>>   qemu-system-s390x: vmstate: get_nullptr expected VMS_NULLPTR_MARKER
>>   qemu-system-s390x: Failed to load s390_css:css
>>   qemu-system-s390x: error while loading state for instance 0x0 of device 's390_css'
>>   qemu-system-s390x: load of migration failed: Invalid argument
>> [Fail due to css mismatch - there is no css 0 in the new vm.]
>>
>> T2. squashing off + explicit given id
>>   qemu-system-s390x: vmstate: get_nullptr expected VMS_NULLPTR_MARKER
>>   qemu-system-s390x: Failed to load s390_css:css
>>   qemu-system-s390x: error while loading state for instance 0x0 of device 's390_css'
>>   qemu-system-s390x: load of migration failed: Invalid argument
>> [Fail due to css mismatch - there is no css 0 in the new vm.]
> Hmm... so should we even try to migrate an empty css 0? It only exists
> because we have created a device that we had to detach anyway because
> it was non-migratable...
> 
> [Probably no easy way to deal with this, though.]
> 

We could make the css go away when its last device is gone.
More generally, I see a problem with shared objects that are created
implicitly.
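
To illustrate the idea, here is a toy model in Python. It is not QEMU
code: the class, the function names and the drop_empty switch are made up
for this sketch, and the "migratable" check is only a stand-in for the
section check that fails in T1/T2. I also assume that, with squashing off,
the virtio devices sit in css 0xfe and vfio0 in css 0.

```python
class ChannelSubsystems:
    """Toy model of css images that are created implicitly on first use."""

    def __init__(self, drop_empty):
        # drop_empty is a hypothetical switch: remove a css once it is empty.
        self.drop_empty = drop_empty
        self.css = {}  # cssid -> set of device ids

    def attach(self, cssid, dev_id):
        # The css comes into existence implicitly with its first device.
        self.css.setdefault(cssid, set()).add(dev_id)

    def detach(self, cssid, dev_id):
        devices = self.css[cssid]
        devices.remove(dev_id)
        if self.drop_empty and not devices:
            # Last device is gone, so the css goes away as well.
            del self.css[cssid]

    def present(self):
        return sorted(self.css)


def migratable(source, target):
    # Stand-in for the load-time check: both sides must have the same css set.
    return source.present() == target.present()


def run_scenario(drop_empty):
    # Source: three devices, then device_del vfio0 before migrating.
    source = ChannelSubsystems(drop_empty)
    source.attach(0xFE, "disk0")
    source.attach(0x00, "vfio0")
    source.attach(0xFE, "rng0")
    source.detach(0x00, "vfio0")

    # Target: same command line without the vfio0 device.
    target = ChannelSubsystems(drop_empty)
    target.attach(0xFE, "disk0")
    target.attach(0xFE, "rng0")

    return migratable(source, target)
```

In this model run_scenario(False) leaves an empty css 0 behind on the
source, so the two sides disagree, while run_scenario(True) makes them
match. It says nothing about streams from QEMU versions that already
exist.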

Obviously we can't fix the past.

@Dong Jia:

Thanks for doing the experiments and publishing your findings.

Halil




