qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-devel] [Bug 692570] [NEW] APIC Latency causing BSOD.


From: La' Maze Johnson
Subject: [Qemu-devel] [Bug 692570] [NEW] APIC Latency causing BSOD.
Date: Mon, 20 Dec 2010 14:15:48 -0000

Public bug reported:

I have a Proxmox Server with the following specs:

Version:

pve-manager: 1.7-10 (pve-manager/1.7/5323)
running kernel: 2.6.32-4-pve
proxmox-ve-2.6.32: 1.7-28
pve-kernel-2.6.32-4-pve: 2.6.32-28
qemu-server: 1.1-25
pve-firmware: 1.0-9
libpve-storage-perl: 1.0-16
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: 1.2-9
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.13.0-2
ksm-control-daemon: 1.0-4

VM Configuration:

name: TS64
ide2: none,media=cdrom
bootdisk: ide0
ostype: w2k3
ide0: data:vm-104-disk-1
memory: 10240
sockets: 1
vlan0: virtio=C6:4C:B3:BB:AD:67
onboot: 1
cores: 4
boot: cad
freeze: 0
cpuunits: 1000
acpi: 1
kvm: 1

CPU 2x Xeon Quad Core E5620 2.4GHZ Processors:

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
stepping : 2
cpu MHz : 2400.323
cache size : 12288 KB
physical id : 0
siblings : 8
core id : 9
cpu cores : 4
apicid : 19
initial apicid : 19
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 
clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall pdpe1gb rdtscp lm 
constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf 
pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 
sse4_2 popcnt lahf_lm ida arat tpr_shadow vnmi flexpriority ept vpid
bogomips : 4800.19
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:

Performance:

CPU BOGOMIPS: 76803.21
REGEX/SECOND: 850066
HD SIZE: 33.96 GB (/dev/mapper/pve-root)
BUFFERED READS: 333.03 MB/sec
AVERAGE SEEK TIME: 6.10 ms
FSYNCS/SECOND: 2948.85
DNS EXT: 131.42 ms
DNS INT: 1.28 ms

I've been successfully running 2 Windows 2003 32-Bit Standard Edition
Servers on this server for over a month now. Both were migrations from
actual physical servers. However, I've continued to receive random
crashes on a Windows 2003 64-bit standard edition running terminal
services, which was a fresh install. The server runs fine for hours
under a decent load (20 users) and then will crash with a 3B bug check
code (System_Service_Exception). I opened a ticket with Microsoft and
submitted multiple memory dumps and their engineers suggested the
following:

Dump Analyses Result:
===================

What happened is that the OS initiated an APIC /software interrupt. This
is handled by the APIC in real hardware. In your Virtual Environment
case where you are using Linux based VM – Proxmox, the VM implementation
somehow has to make it happen on a virtual machine with the same latency
in the virtual APIC. The problem is that there is a latency between when
it was initiated and when it happened.


Below are the details for understanding the process or concept of APIC 
interrupts:

What the Local APIC Is
The Local APIC (LAPIC) is a circuit that is part of the CPU chip. It contains 
these basic elements:
A mechanism for generating
1. interrupts
2. A mechanism for accepting interrupts
3. A timer

If you have a multiprocessor system, the APIC's are wired together so
they can communicate. So the LAPIC on CPU 0 can communicate with the
LAPIC on CPU 1, etc.


What the IO APIC Is

This is a separate chip that is wired to the Local APIC's so it can forward 
interrupts on to the CPU chips. It is programmed similar to the 8259's but has 
more flexibility.
It is wired to the same bus as the Local APIC's so it can communicate with them.

Note:- In our scenario, it’s all Virtualized interrupts or calls because of 
hypervisor in picture and thus we need to contact the VM application vendor to 
get a check of this latency issue in APIC interrupt.
------------------------------------------------End of 
Message----------------------------------


Their engineers are saying that there is a latency issue with APIC. I'm
not exactly sure how this can be corrected. Is this a known issue and is
their a solution to this problem. I love Proxmox, but my main reason for
using it was to upgrade my terminal server to better hardware, while
leveraging it for other virtual machines as well.  The forum
administrator for Proxmox, suggested I bring this issue to your
attention.

** Affects: qemu
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/692570

Title:
  APIC Latency causing BSOD.

Status in QEMU:
  New

Bug description:
  I have a Proxmox Server with the following specs:

Version:

pve-manager: 1.7-10 (pve-manager/1.7/5323)
running kernel: 2.6.32-4-pve
proxmox-ve-2.6.32: 1.7-28
pve-kernel-2.6.32-4-pve: 2.6.32-28
qemu-server: 1.1-25
pve-firmware: 1.0-9
libpve-storage-perl: 1.0-16
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: 1.2-9
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.13.0-2
ksm-control-daemon: 1.0-4

VM Configuration:

name: TS64
ide2: none,media=cdrom
bootdisk: ide0
ostype: w2k3
ide0: data:vm-104-disk-1
memory: 10240
sockets: 1
vlan0: virtio=C6:4C:B3:BB:AD:67
onboot: 1
cores: 4
boot: cad
freeze: 0
cpuunits: 1000
acpi: 1
kvm: 1

CPU 2x Xeon Quad Core E5620 2.4GHZ Processors:

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
stepping : 2
cpu MHz : 2400.323
cache size : 12288 KB
physical id : 0
siblings : 8
core id : 9
cpu cores : 4
apicid : 19
initial apicid : 19
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 
clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall pdpe1gb rdtscp lm 
constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf 
pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 
sse4_2 popcnt lahf_lm ida arat tpr_shadow vnmi flexpriority ept vpid
bogomips : 4800.19
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:

Performance:

CPU BOGOMIPS: 76803.21
REGEX/SECOND: 850066
HD SIZE: 33.96 GB (/dev/mapper/pve-root)
BUFFERED READS: 333.03 MB/sec
AVERAGE SEEK TIME: 6.10 ms
FSYNCS/SECOND: 2948.85
DNS EXT: 131.42 ms
DNS INT: 1.28 ms

I've been successfully running 2 Windows 2003 32-Bit Standard Edition Servers 
on this server for over a month now. Both were migrations from actual physical 
servers. However, I've continued to receive random crashes on a Windows 2003 
64-bit standard edition running terminal services, which was a fresh install. 
The server runs fine for hours under a decent load (20 users) and then will 
crash with a 3B bug check code (System_Service_Exception). I opened a ticket 
with Microsoft and submitted multiple memory dumps and their engineers 
suggested the following:

Dump Analyses Result:
===================

What happened is that the OS initiated an APIC /software interrupt. This is 
handled by the APIC in real hardware. In your Virtual Environment case where 
you are using Linux based VM – Proxmox, the VM implementation somehow has to 
make it happen on a virtual machine with the same latency in the virtual APIC. 
The problem is that there is a latency between when it was initiated and when 
it happened.


Below are the details for understanding the process or concept of APIC 
interrupts:

What the Local APIC Is
The Local APIC (LAPIC) is a circuit that is part of the CPU chip. It contains 
these basic elements:
A mechanism for generating
1. interrupts
2. A mechanism for accepting interrupts
3. A timer

If you have a multiprocessor system, the APIC's are wired together so they can 
communicate. So the LAPIC on CPU 0 can communicate with the LAPIC on CPU 1, etc.


What the IO APIC Is

This is a separate chip that is wired to the Local APIC's so it can forward 
interrupts on to the CPU chips. It is programmed similar to the 8259's but has 
more flexibility.
It is wired to the same bus as the Local APIC's so it can communicate with them.

Note:- In our scenario, it’s all Virtualized interrupts or calls because of 
hypervisor in picture and thus we need to contact the VM application vendor to 
get a check of this latency issue in APIC interrupt.
------------------------------------------------End of 
Message----------------------------------



Their engineers are saying that there is a latency issue with APIC. I'm not 
exactly sure how this can be corrected. Is this a known issue and is their a 
solution to this problem. I love Proxmox, but my main reason for using it was 
to upgrade my terminal server to better hardware, while leveraging it for other 
virtual machines as well.  The forum administrator for Proxmox, suggested I 
bring this issue to your attention.





reply via email to

[Prev in Thread] Current Thread [Next in Thread]