[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size ex
From: |
Natanael Copa |
Subject: |
Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error |
Date: |
Sat, 23 Feb 2019 16:55:23 +0100 |
On Fri, 22 Feb 2019 14:04:20 +0000
Stefan Hajnoczi <address@hidden> wrote:
> On Fri, Feb 22, 2019 at 12:57 PM Fernando Casas Schössow
> <address@hidden> wrote:
>
> I have CCed Natanael Copa, qemu package maintainer in Alpine Linux.
>
> Fernando: Can you confirm that the bug occurs with an unmodified
> Alpine Linux qemu binary?
I wonder exactly which binary it is. What arcitecture (x86 or x86_64)
and what binary. Is it ddynamic or statically linked qemu-system-x86_64?
> Richard: Commit 7db2145a6826b14efceb8dd64bfe6ad8647072eb ("bswap: Add
> host endian unaligned access functions") introduced the unaligned
> memory access functions in question here. Please see below for
> details on the bug - basically QEMU code assumes they are atomic, but
> that is not guaranteed :(. Any ideas for how to fix this?
>
> Natanael: It seems likely that the qemu package in Alpine Linux
> suffers from a compilation issue resulting in a broken QEMU. It may
> be necessary to leave the compiler optimization flag alone in APKBUILD
> to work around this problem.
>
> Here are the details. QEMU relies on the compiler turning
> memcpy(&dst, &src, 2) turning into a load instruction in
> include/qemu/bswap.h:lduw_he_p() (explanation below):
>
> /* Any compiler worth its salt will turn these memcpy into native unaligned
> operations. Thus we don't need to play games with packed attributes, or
> inline byte-by-byte stores. */
>
> static inline int lduw_he_p(const void *ptr)
> {
> uint16_t r;
> memcpy(&r, ptr, sizeof(r));
> return r;
> }
>
> Here is the disassembly snippet of virtqueue_pop() from Fedora 29 that
> shows the load instruction:
>
> 398166: 0f b7 42 02 movzwl 0x2(%rdx),%eax
> 39816a: 66 89 43 32 mov %ax,0x32(%rbx)
>
> Here is the instruction sequence in the Alpine Linux binary:
>
> 455562: ba 02 00 00 00 mov $0x2,%edx
> 455567: e8 74 24 f3 ff callq 3879e0 <address@hidden>
> 45556c: 0f b7 44 24 42 movzwl 0x42(%rsp),%eax
> 455571: 66 41 89 47 32 mov %ax,0x32(%r15)
>
> It's calling memcpy instead of using a load instruction.
I tried to find this section. How do you get the assembly listing of
relevant secion? I tried to do "disas virtio_pop" from
`gdb /usr/bin/qemu-system-x86_64` from the binary in alpine edge. I
could find 2 memcpy but none of them look like a 16 bit operation after:
0x00000000004551f1 <+353>: mov 0x10(%rsp),%rdi
0x00000000004551f6 <+358>: mov $0x10,%edx
0x00000000004551fb <+363>: callq 0x3879e0 <address@hidden>
0x0000000000455200 <+368>: movzwl 0x5c(%rsp),%eax
0x0000000000455205 <+373>: test $0x4,%al
0x0000000000455207 <+375>: je 0x4552aa <virtqueue_pop+538>
....
0x0000000000455291 <+513>: mov 0x10(%rsp),%rdi
0x0000000000455296 <+518>: mov $0x10,%edx
0x000000000045529b <+523>: callq 0x3879e0 <address@hidden>
0x00000000004552a0 <+528>: mov %rbp,0x20(%rsp)
0x00000000004552a5 <+533>: movzwl 0x5c(%rsp),%eax
0x00000000004552aa <+538>: lea 0x20e0(%rsp),%rdi
0x00000000004552b2 <+546>: xor %r11d,%r11d
0x00000000004552b5 <+549>: mov %r15,0x38(%rsp)
>
> Fernando found that QEMU's virtqueue_pop() function sees bogus values
> when loading a 16-bit guest RAM location. Paolo figured out that the
> bogus value can be produced by memcpy() when another thread is
> updating the 16-bit memory location simultaneously. This is a race
> condition between one thread loading the 16-bit value and another
> thread storing it (in this case a guest vcpu thread). Sometimes
> memcpy() may load one old byte and one new byte, resulting in a bogus
> value.
>
> The symptom that Fernando experienced is a "Virtqueue size exceeded"
> error message from QEMU and then the virtio-blk or virtio-scsi device
> stops working. This issue potentially affects other device emulation
> code in QEMU as well, not just virtio devices.
>
> For the time being, I suggest tweaking the APKBUILD so the memcpy() is
> not generated. Hopefully QEMU can make the code more portable in the
> future so the compiler always does the expected thing, but this may
> not be easily possible.
I was thinking of something in the lines of:
typedef volatile uint16_t __attribute__((__may_alias__)) volatile_uint16_t;
static inline int lduw_he_p(const void *ptr)
{
volatile_uint16_t r = *(volatile_uint16_t*)ptr;
return r;
}
I can test different CFLAGS with and without the _FORTIFY_SOURCE and
with different variants of memcpy (like __builtint_memcpy etc) but i
need find a way to get the correct assembly output so I know if/when I
have found something that works.
Thanks!
-nc
>
> Stefan
- Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error, (continued)
- Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error, Dr. David Alan Gilbert, 2019/02/22
- Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error, Paolo Bonzini, 2019/02/22
- Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error, Dr. David Alan Gilbert, 2019/02/22
- Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error, Natanael Copa, 2019/02/23
- Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error, Paolo Bonzini, 2019/02/26
- Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error, Fernando Casas Schössow, 2019/02/28
- Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error,
Natanael Copa <=
- Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error, Peter Maydell, 2019/02/23
- Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error, Natanael Copa, 2019/02/25
- Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error, Peter Maydell, 2019/02/25
- Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error, Fernando Casas Schössow, 2019/02/25
- Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error, Natanael Copa, 2019/02/25
- Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error, Peter Maydell, 2019/02/25
- Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error, Natanael Copa, 2019/02/25
- Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error, Fernando Casas Schössow, 2019/02/25
- Message not available
- Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error, Fernando Casas Schössow, 2019/02/25
- Re: [Qemu-devel] [Qemu-block] Guest unresponsive after Virtqueue size exceeded error, Peter Maydell, 2019/02/28