qemu-devel

Re: [Qemu-devel] E5-2620v2 - emulation stop error


From: Kevin O'Connor
Subject: Re: [Qemu-devel] E5-2620v2 - emulation stop error
Date: Wed, 11 Mar 2015 12:37:39 -0400
User-agent: Mutt/1.5.23 (2014-03-12)

On Wed, Mar 11, 2015 at 03:53:07PM +0000, Dr. David Alan Gilbert wrote:
> * Kevin O'Connor (address@hidden) wrote:
> > On Wed, Mar 11, 2015 at 01:45:57PM +0000, Dr. David Alan Gilbert wrote:
> > > * Bandan Das (address@hidden) wrote:
> > > > "Dr. David Alan Gilbert" <address@hidden> writes:
> > > > > while true; do
> > > > >   (sleep 5; echo -e '\001cq\n') | /opt/qemu-try-world3/bin/qemu-system-x86_64 \
> > > > >     -machine pc-i440fx-2.0,accel=kvm -m 1024 -smp 128 -nographic -device sga 2>&1 \
> > > > >     | tee /tmp/qemu.op
> > > > >   grep "internal error" /tmp/qemu.op -q && break
> > > > > done

That is a truly impressive command line, BTW.

> > > > > address@hidden qemu-world3]# git bisect bad
> > > > > 21f5826a04d38e19488f917e1eef22751490c769 is the first bad commit
> > > > 
> > > > I can reproduce this on E5-2620 v2 with  David's "while true" test.
> > > > (The emulation failure I mean, not the suberror 2 that Andrey is seeing)
> > > > The commit that seems to have introduced this is -
> > > > 
> > > > commit 0673b7870063a3affbad9046fb6d385a4e734c19
> > > > Author: Kevin O'Connor <address@hidden>
> > > > Date:   Sat May 24 10:49:50 2014 -0400
> > > > 
> > > >     smp: Replace QEMU SMP init assembler code with C; run only in 32bit mode.
> > [...]
> > > Turning on debug logging
> > > ( -chardev file,id=log,path=/tmp/debugcon.$$ -device isa-debugcon,chardev=log,iobase=0x402 )
> > > 
> > > SeaBIOS (version rel-1.8.0-0-g4c59f5d-20150219_092859-nilsson.home.kraxel.org)
> > [...]
> > > Found 1 cpu(s) max supported 128 cpu(s)
> > 
> > Something is very odd here.  When I run the above command (on an older
> > AMD machine) I get:
> > 
> > Found 128 cpu(s) max supported 128 cpu(s)
> > 
> > That first value (1 vs 128) comes from QEMU (via cmos index 0x5f).
> > That is, during smp init, SeaBIOS expects QEMU to tell it how many
> > cpus are active, and SeaBIOS waits until that many CPUs check in from
> > its SIPI request before proceeding.
> > 
> > I wonder if QEMU reported only 1 active cpu via that cmos register,
> > but more were actually active.  If that was the case, it could
> > certainly explain the failure - as multiple cpus could be running
> > without the SIPI trampoline in place.
> > 
> > What does the log look like on a non-failure case?
> 
> I had to drop down from 128 to get a working run with debug; here
> are two runs with -smp 20   the first one worked, the second one
> failed.
[...]
> =========== Working ===========
> 
> SeaBIOS (version rel-1.8.0-0-g4c59f5d-20150219_092859-nilsson.home.kraxel.org)
[...]
> Found 20 cpu(s) max supported 20 cpu(s)
[...]
> =========== Broken ===========
> 
> SeaBIOS (version rel-1.8.0-0-g4c59f5d-20150219_092859-nilsson.home.kraxel.org)
[...]
> Found 1 cpu(s) max supported 20 cpu(s)

So, I couldn't get this to fail on my older AMD machine at all with
the default SeaBIOS code.  But when I changed the code with the patch
below, it failed right away:

KVM internal error. Suberror: 1
emulation failure
EAX=00000000 EBX=00000000 ECX=00000000 EDX=000fd2b8
ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
EIP=000fd2c1 EFL=00000007 [-----PC] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
CS =0008 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
SS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
DS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
FS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
GS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008300 DPL=0 TSS16-busy
GDT=     000f6a50 00000037
IDT=     000f6a8e 00000000
CR0=60000011 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=66 ba b8 d2 0f 00 e9 a2 fe f3 90 f0 0f ba 2d 04 ff fb 3f 00 <72> f3 8b 25 00 ff fb 3f e8 d2 65 ff ff c7 05 04 ff fb 3f 00 00 00 00 f4 eb fd fa fc 66 b8
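
For anyone reading the dump, here is a minimal sketch in C of how the
CR0 value can be decoded.  The bit positions are the architectural x86
ones; the helper name cr0_has() is hypothetical and is not taken from
QEMU or SeaBIOS.

```c
#include <stdint.h>

/* Architectural x86 CR0 bit positions. */
#define CR0_PE (1u << 0)   /* protection enable */
#define CR0_NW (1u << 29)  /* not write-through */
#define CR0_CD (1u << 30)  /* cache disable */
#define CR0_PG (1u << 31)  /* paging */

/* Hypothetical helper: returns 1 if 'flag' is set in 'cr0', else 0. */
int cr0_has(uint32_t cr0, uint32_t flag)
{
    return (cr0 & flag) != 0;
}
```

Applied to CR0=60000011 from the dump, PE is set and PG is clear, so
the failing vCPU was in protected mode without paging, which matches
the flat CS32 segment shown above.  CD and NW are also set, as they
are in the CR0 reset value.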

And the failed debug output looks like:

SeaBIOS (version rel-1.8.0-7-gd23eba6-dirty-20150311_121819-morn.localdomain)
[...]
cmos_smp_count0=20
[...]
cmos_smp_count=1
cmos_smp_count2=1/20
Found 1 cpu(s) max supported 20 cpu(s)
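
To make the suspected failure mode concrete, here is a small model in
C of the arithmetic involved.  It is only a sketch: expected_cpus()
and unaccounted_aps() are hypothetical names, and the only thing taken
from SeaBIOS is the "cmos value plus one" computation that appears in
the patch below.

```c
#include <stdint.h>

/* Mirrors 'u8 cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1':
 * the number of CPUs the BIOS will wait for, given the raw byte read
 * from cmos index 0x5f.  The result is truncated to 8 bits, as in the
 * original u8 variable. */
uint8_t expected_cpus(uint8_t cmos_0x5f)
{
    return (uint8_t)(cmos_0x5f + 1);
}

/* Hypothetical helper: how many CPUs could still be running after the
 * BIOS stops waiting, if 'cpus_started' CPUs were really launched but
 * the cmos byte reported fewer. */
unsigned unaccounted_aps(uint8_t cmos_0x5f, unsigned cpus_started)
{
    unsigned expected = expected_cpus(cmos_0x5f);
    return cpus_started > expected ? cpus_started - expected : 0;
}
```

With -smp 20, a cmos byte of 19 gives an expected count of 20 and no
unaccounted CPUs.  A cmos byte of 0 gives an expected count of 1, so
in this model the wait loop would exit immediately and up to 19 CPUs
could still be executing the trampoline when it is torn down.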

I'm going to check the assembly for a compiler error, but is it
possible QEMU is returning incorrect data in cmos index 0x5f?

David, any chance you can recompile seabios and double check your
output?

-Kevin


--- a/src/fw/smp.c
+++ b/src/fw/smp.c
@@ -128,6 +128,7 @@ smp_setup(void)
 
     // Wait for other CPUs to process the SIPI.
     u8 cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1;
+    dprintf(1, "cmos_smp_count=%d\n", cmos_smp_count);
     while (cmos_smp_count != CountCPUs)
         asm volatile(
             // Release lock and allow other processors to use the stack.
@@ -140,6 +141,8 @@ smp_setup(void)
             : "+m" (SMPLock), "+m" (SMPStack)
             : : "cc", "memory");
     yield();
+    dprintf(1, "cmos_smp_count2=%d/%d\n", cmos_smp_count
+            , rtc_read(CMOS_BIOS_SMP_COUNT) + 1);
 
     // Restore memory.
     *(u64*)BUILD_AP_BOOT_ADDR = old;
diff --git a/src/post.c b/src/post.c
index 9ea5620..dc11c72 100644
--- a/src/post.c
+++ b/src/post.c
@@ -170,6 +170,7 @@ platform_hardware_setup(void)
     clock_setup();
 
     // Platform specific setup
+    dprintf(1, "cmos_smp_count0=%d\n", rtc_read(CMOS_BIOS_SMP_COUNT) + 1);
     qemu_platform_setup();
     coreboot_platform_setup();
 }


