Re: [BUG] vhost-vdpa: qemu-system-s390x crashes with second virtio-net-c

From: Jason Wang
Subject: Re: [BUG] vhost-vdpa: qemu-system-s390x crashes with second virtio-net-ccw device
Date: Mon, 27 Jul 2020 16:51:23 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0

On 2020/7/27 4:41 PM, Cornelia Huck wrote:
On Mon, 27 Jul 2020 15:38:12 +0800
Jason Wang <jasowang@redhat.com> wrote:

On 2020/7/27 2:43 PM, Cornelia Huck wrote:
On Sat, 25 Jul 2020 08:40:07 +0800
Jason Wang <jasowang@redhat.com> wrote:
On 2020/7/24 11:34 PM, Cornelia Huck wrote:
On Fri, 24 Jul 2020 11:17:57 -0400
"Michael S. Tsirkin"<mst@redhat.com>  wrote:
On Fri, Jul 24, 2020 at 04:56:27PM +0200, Cornelia Huck wrote:
On Fri, 24 Jul 2020 09:30:58 -0400
"Michael S. Tsirkin"<mst@redhat.com>  wrote:
On Fri, Jul 24, 2020 at 03:27:18PM +0200, Cornelia Huck wrote:
When I start qemu with a second virtio-net-ccw device (i.e. adding
-device virtio-net-ccw in addition to the autogenerated device), I get
a segfault. gdb points to

#0  0x000055d6ab52681d in virtio_net_get_config (vdev=<optimized out>,
       config=0x55d6ad9e3f80 "RT") at 
146         if (nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {

(backtrace doesn't go further)
The core was incomplete, but running under gdb directly shows that it
is just a bog-standard config space access (first for that device).

The cause of the crash is that nc->peer is not set... no idea how that
can happen, not that familiar with that part of QEMU. (Should the code
check, or is that really something that should not happen?)

What I don't understand is why it is set correctly for the first,
autogenerated virtio-net-ccw device, but not for the second one, and
why virtio-net-pci doesn't show these problems. The only difference
between -ccw and -pci that comes to my mind here is that config space
accesses for ccw are done via an asynchronous operation, so timing
might be different.
Hopefully Jason has an idea. Could you post a full command line
please? Do you need a working guest to trigger this? Does this trigger
on an x86 host?
Yes, it does trigger with tcg-on-x86 as well. I've been using

s390x-softmmu/qemu-system-s390x -M s390-ccw-virtio,accel=tcg -cpu qemu,zpci=on
-m 1024 -nographic -device virtio-scsi-ccw,id=scsi0,devno=fe.0.0001
-drive file=/path/to/image,format=qcow2,if=none,id=drive-scsi0-0-0-0
-device virtio-net-ccw

It seems it needs the guest actually doing something with the nics; I
cannot reproduce the crash if I use the old advent calendar moon buggy
image and just add a virtio-net-ccw device.

(I don't think it's a problem with my local build, as I see the problem
both on my laptop and on an LPAR.)
It looks to me like we forgot to check for the existence of the peer.

Please try the attached patch to see if it works.
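
(The attached patch is not reproduced in this archive. As a rough illustration only, a check of the kind discussed here might look like the sketch below. The types are simplified stand-ins for QEMU's NetClientState and NetClientInfo, NET_CLIENT_DRIVER_NONE is used merely as a placeholder value, and peer_is_vhost_vdpa() is a hypothetical helper name, not a function taken from the QEMU tree or from the patch.)

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for QEMU's types; the real definitions are larger. */
typedef enum {
    NET_CLIENT_DRIVER_NONE,        /* placeholder for "some other backend" */
    NET_CLIENT_DRIVER_VHOST_VDPA,
} NetClientDriver;

typedef struct NetClientInfo {
    NetClientDriver type;
} NetClientInfo;

typedef struct NetClientState {
    const NetClientInfo *info;
    struct NetClientState *peer;   /* NULL when the NIC has no backend */
} NetClientState;

/*
 * Hypothetical helper: returns true only if a peer exists and that peer
 * is a vhost-vdpa backend. The NULL test comes first, so nc->peer->info
 * is never dereferenced for a NIC that was started without a peer.
 */
static bool peer_is_vhost_vdpa(const NetClientState *nc)
{
    return nc->peer != NULL &&
           nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA;
}
```

With a guard like this, a config space read on a device without a peer would simply skip the vhost-vdpa branch instead of dereferencing a NULL pointer as in the backtrace above.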
Thanks, that patch gets my guest up and running again. So, FWIW,

Tested-by: Cornelia Huck <cohuck@redhat.com>

Any idea why this did not hit with virtio-net-pci (or the autogenerated
virtio-net-ccw device)?

It can be hit with virtio-net-pci as well (just start without peer).
Hm, I had not been able to reproduce the crash with a 'naked' -device
virtio-net-pci. But checking seems to be the right idea anyway.

Sorry for being unclear. I meant that for the networking part, you just need to start without a peer, and you need a real guest (any Linux) that tries to access the config space of virtio-net.


For the autogenerated virtio-net-ccw, I think the reason is that it
already has a peer set.
Ok, that might well be.
