Re: [Qemu-devel] Lock contention in QEMU


From: Weiwei Jia
Subject: Re: [Qemu-devel] Lock contention in QEMU
Date: Wed, 14 Dec 2016 20:06:10 -0500

Hi Stefan,

Thanks for your reply. Please see my replies inline below.

On Wed, Dec 14, 2016 at 2:31 PM, Stefan Hajnoczi <address@hidden> wrote:
> On Wed, Dec 14, 2016 at 12:58:11AM -0500, Weiwei Jia wrote:
>> I find that the timeslice of a vCPU thread in QEMU/KVM is unstable when
>> the guest OS issues lots of read requests (for example, reading 4KB at a
>> time, 8GB in total, from one file). The instability also appears to be
>> caused by lock contention in the QEMU layer. I see this problem under
>> the following workload.
>>
>> Workload settings:
>> In the VMM there are 6 pCPUs: pCPU0, pCPU1, pCPU2, pCPU3, pCPU4, and
>> pCPU5. Two kernel virtual machines (VM1 and VM2) run on the VMM, each
>> with 5 virtual CPUs (vCPU0, vCPU1, vCPU2, vCPU3, vCPU4). vCPU0 of VM1
>> and vCPU0 of VM2 are pinned to pCPU0 and pCPU5 respectively, dedicated
>> to handling interrupts. vCPU1 of VM1 and vCPU1 of VM2 are pinned to
>> pCPU1; vCPU2 of VM1 and vCPU2 of VM2 are pinned to pCPU2; vCPU3 of VM1
>> and vCPU3 of VM2 are pinned to pCPU3; vCPU4 of VM1 and vCPU4 of VM2 are
>> pinned to pCPU4. Except for vCPU0 of VM2 (pinned to pCPU5), every vCPU
>> in VM1 and VM2 runs one CPU-intensive thread (while(1){i++;}) so that
>> the vCPU never goes idle. In VM1, I start one I/O thread on vCPU2 that
>> reads 4KB from one file at a time (8GB in total). The I/O scheduler in
>> VM1 and VM2 is NOOP; the I/O scheduler in the VMM is CFQ. I also pinned
>> the I/O worker threads launched by QEMU to pCPU5 (note: there is no
>> CPU-intensive thread on pCPU5, so the I/O requests are handled by the
>> QEMU I/O worker threads as soon as possible). The process scheduling
>> class in the VMs and the VMM is CFS.
>
> Did you pin the QEMU main loop to pCPU5?  This is the QEMU process' main
> thread and it handles ioeventfd (virtqueue kick) and thread pool
> completions.

No, I did not pin the main loop to pCPU5. Do you mean that if I pin the
QEMU main loop to pCPU5 under the above workload, the timeslice of the
vCPU2 thread will be stable even though there are lots of I/O requests?
I am not using virtio for the VM's disk; I use SCSI. My whole VM XML
configuration file is as follows.

<domain type='kvm' id='2'>
  <name>kvm1</name>
  <uuid>8e9c4603-c4b5-fa41-b251-1dc4ffe1872c</uuid>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <vcpu placement='static'>4</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='0'/>
    <vcpupin vcpu='1' cpuset='1'/>
    <vcpupin vcpu='2' cpuset='2'/>
    <vcpupin vcpu='3' cpuset='3'/>
  </cputune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-2.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/bin/kvm-spice</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/home/images/kvm1.img'/>
      <target dev='hda' bus='scsi'/>
      <alias name='scsi0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='block' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
      <alias name='ide0-1-0'/>
      <address type='drive' controller='0' bus='1' target='0' unit='0'/>
    </disk>
    <controller type='usb' index='0'>
      <alias name='usb0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='scsi' index='0'>
      <alias name='scsi0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <controller type='ide' index='0'>
      <alias name='ide0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <interface type='network'>
      <mac address='52:54:00:01:ab:ca'/>
      <source network='default'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/13'/>
      <target port='0'/>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/13'>
      <source path='/dev/pts/13'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='vnc' port='5900' autoport='yes' listen='127.0.0.1'>
      <listen type='address' address='127.0.0.1'/>
    </graphics>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='none'/>
</domain>
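
For reference, the host-side pinning mentioned in the workload description
above, i.e. pinning the QEMU I/O worker threads to pCPU5, can be done roughly
as follows. This is only an illustrative sketch: the pgrep pattern and the
worker TID are placeholders that depend on the actual system.

# illustrative only: locate the QEMU process backing the kvm1 domain
QEMU_PID=$(pgrep -f 'kvm1' | head -n 1)

# list its threads so the I/O worker thread IDs can be identified
ps -T -p "$QEMU_PID" -o tid,comm,pcpu

# pin one worker thread (TID taken from the listing above) to pCPU5
taskset -pc 5 <worker-tid>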



>
>>
>> Linux Kernel version for VMM is: 3.16.39
>> Linux Kernel version for VM1 and VM2 is: 4.7.4
>> QEMU emulator version is: 2.0.0
>>
>> When I run the above workload, I find that the timeslice of the vCPU2
>> thread jitters a lot. I suspect this is triggered by lock contention in
>> the QEMU layer, based on a debug log I added just before
>> schedule->__schedule->context_switch in the VMM Linux kernel. Whenever
>> the timeslice jitters badly, the following debug information appears.
>>
>> 7097537 Dec 13 11:22:33 mobius04 kernel: [39163.015789] Call Trace:
>> 7097538 Dec 13 11:22:33 mobius04 kernel: [39163.015791] [<ffffffff8176b2f0>] dump_stack+0x64/0x84
>> 7097539 Dec 13 11:22:33 mobius04 kernel: [39163.015793] [<ffffffff8176bf85>] __schedule+0x5b5/0x960
>> 7097540 Dec 13 11:22:33 mobius04 kernel: [39163.015794] [<ffffffff8176c409>] schedule+0x29/0x70
>> 7097541 Dec 13 11:22:33 mobius04 kernel: [39163.015796] [<ffffffff810ef4d8>] futex_wait_queue_me+0xd8/0x150
>> 7097542 Dec 13 11:22:33 mobius04 kernel: [39163.015798] [<ffffffff810ef6fb>] futex_wait+0x1ab/0x2b0
>> 7097543 Dec 13 11:22:33 mobius04 kernel: [39163.015800] [<ffffffff810eef00>] ? get_futex_key+0x2d0/0x2e0
>> 7097544 Dec 13 11:22:33 mobius04 kernel: [39163.015804] [<ffffffffc0290105>] ? __vmx_load_host_state+0x125/0x170 [kvm_intel]
>> 7097545 Dec 13 11:22:33 mobius04 kernel: [39163.015805] [<ffffffff810f1275>] do_futex+0xf5/0xd20
>> 7097546 Dec 13 11:22:33 mobius04 kernel: [39163.015813] [<ffffffffc0222690>] ? kvm_vcpu_ioctl+0x100/0x560 [kvm]
>> 7097547 Dec 13 11:22:33 mobius04 kernel: [39163.015816] [<ffffffff810b06f0>] ? __dequeue_entity+0x30/0x50
>> 7097548 Dec 13 11:22:33 mobius04 kernel: [39163.015818] [<ffffffff81013d06>] ? __switch_to+0x596/0x690
>> 7097549 Dec 13 11:22:33 mobius04 kernel: [39163.015820] [<ffffffff811f9f23>] ? do_vfs_ioctl+0x93/0x520
>> 7097550 Dec 13 11:22:33 mobius04 kernel: [39163.015822] [<ffffffff810f1f1d>] SyS_futex+0x7d/0x170
>> 7097551 Dec 13 11:22:33 mobius04 kernel: [39163.015824] [<ffffffff8116d1b2>] ? fire_user_return_notifiers+0x42/0x50
>> 7097552 Dec 13 11:22:33 mobius04 kernel: [39163.015826] [<ffffffff810154b5>] ? do_notify_resume+0xc5/0x100
>> 7097553 Dec 13 11:22:33 mobius04 kernel: [39163.015828] [<ffffffff81770a8d>] system_call_fastpath+0x1a/0x1f
>>
>>
>> If so, I think this may be a scalability problem in the QEMU I/O path.
>> Is there a feature in QEMU that avoids it? Could you please give me some
>> suggestions on how to keep the timeslice of the vCPU2 thread stable even
>> when it issues lots of read requests?
>
> Yes, there is a way to reduce jitter caused by the QEMU global mutex:
>
> qemu -object iothread,id=iothread0 \
>      -drive if=none,id=drive0,file=test.img,format=raw,cache=none \
>      -device virtio-blk-pci,iothread=iothread0,drive=drive0
>
> Now the ioeventfd and thread pool completions will be processed in
> iothread0 instead of the QEMU main loop thread.  This thread does not
> take the QEMU global mutex so vcpu execution is not hindered.
>
> This feature is called virtio-blk dataplane.
>
> You can query IOThread thread IDs using the query-iothreads QMP command.
> This will allow you to pin iothread0 to pCPU5.
>
> Please let us know if this helps.

Does this feature only work for virtio? Does it also work with SCSI or IDE?
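
In case virtio is required, below is how I would expect the iothread
configuration to look in the libvirt domain XML, assuming a new enough
libvirt/QEMU and switching the disk from SCSI to virtio-blk. This is an
untested sketch based on the libvirt documentation, not something I have run.

  <!-- define one dedicated IOThread (assumes libvirt >= 1.2.8) -->
  <iothreads>1</iothreads>
  <cputune>
    <!-- keep the existing vcpupin entries and additionally pin the IOThread to pCPU5 -->
    <iothreadpin iothread='1' cpuset='5'/>
  </cputune>
  <devices>
    <disk type='file' device='disk'>
      <!-- hand this disk to IOThread 1; bus changed from scsi to virtio -->
      <driver name='qemu' type='raw' cache='none' iothread='1'/>
      <source file='/home/images/kvm1.img'/>
      <target dev='vda' bus='virtio'/>
    </disk>
  </devices>

If that works, I assume the IOThread's host thread ID could be confirmed with
the query-iothreads command you mentioned, for example via something like
virsh qemu-monitor-command kvm1 --pretty '{"execute":"query-iothreads"}'.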

Thank you,
Weiwei Jia


