qemu-devel

Re: [Qemu-devel] [edk2] (PAM stuff) reset doesn't work on OVMF + SeaBIOS


From: Laszlo Ersek
Subject: Re: [Qemu-devel] [edk2] (PAM stuff) reset doesn't work on OVMF + SeaBIOS CSM
Date: Mon, 18 Feb 2013 18:12:55 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.12) Gecko/20130108 Thunderbird/10.0.12

On 02/18/13 13:53, David Woodhouse wrote:

> Nevertheless, on my workstation as on yours, we do seem to end up
> executing from the CSM in RAM when we reset. But on my laptop, it
> executes the *ROM* as it should.
>
> This patch 'fixes' it, and I think it might even be correct in itself,
> but I don't think it's a correct fix for the problem we're discussing.
> And I certainly want to know what's different on my laptop that makes it
> work *without* this patch.
>
> Either there's some weirdness with setting the high CS base address, on
> CPU reset. Or perhaps the contents of the memory region at 0xfffffff0
> have *really* been changed along with the sub-1MiB range. Or maybe the
> universe just hates us...

We're ending up in the wrong place, under 1MB (which is consistent with
your "reset the PAMs" patch -- state of PAMs should only matter below
1MB).

I single-stepped qemu-1.3.1 in x86_cpu_reset() /
cpu_x86_load_seg_cache(), and we seem to set the correct base. However,
when I pause the VM while it's spinning in the reset loop and issue
the following in virsh:

# qemu-monitor-command --domain \
  fw-mixed.g-f18xfce2012121716.e-upstream --hmp --cmd \
  cpu 0

# qemu-monitor-command --domain \
  fw-mixed.g-f18xfce2012121716.e-upstream --hmp --cmd \
  info registers

for EIP and CS I get (from cpu_x86_dump_seg_cache(), in the
"HF_CS64_MASK clear" branch):

EAX=00000000 EBX=00000000 ECX=00000000 EDX=00000623
ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
EIP=0000fff0 EFL=00000002 [-------] CPL=3 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 0000ffff 0000f300
CS =f000 000f0000 0000ffff 0000f300
    ^    ^        ^        ^
    |    base     limit    flags
    selector

SS =0000 00000000 0000ffff 0000f300
DS =0000 00000000 0000ffff 0000f300
FS =0000 00000000 0000ffff 0000f300
GS =0000 00000000 0000ffff 0000f300
LDT=0000 00000000 0000ffff 00008200
TR =0000 feffd000 00002088 00008b00
GDT=     00000000 0000ffff
IDT=     00000000 0000ffff
CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000

(1) The three high nibbles of the CS base are lost: the dump shows
000f0000 where ffff0000 is expected.
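
To spell out what this means for the first instruction fetch, here is a
minimal sketch in C. The helper name linear_addr() is made up for
illustration and is not a QEMU or KVM function; it only adds the cached
CS base to EIP.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from QEMU/KVM): linear address of the next
 * instruction fetch, computed from the cached CS base and EIP. */
uint32_t linear_addr(uint32_t cs_base, uint32_t eip)
{
    /* The dump shows a 64 KiB segment limit (0000ffff), so EIP has to
     * stay within it. */
    assert(eip <= 0xffffu);
    return cs_base + eip;
}
```

With a base of 0xffff0000, linear_addr(0xffff0000, 0xfff0) is 0xfffffff0,
the address David mentions. With the dumped base,
linear_addr(0x000f0000, 0xfff0) is 0x000ffff0, which lies below 1MB and is
therefore subject to the PAMs.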


Furthermore, the flags value is (Intel SDM Vol.3A, 3.4.5):

  1 11  1 0011 00000000
  P DPL S type base 23:16
  ^ ^   ^
  | |   descriptor type (1 == code or data segment, 0 == system segment),
  | |   DESC_S_MASK
  | descriptor privilege level (3 == least privileged)
  segment present, DESC_P_MASK

The "type" field depends on the S bit (here 1 == code/data). 0011b means
(see 3.4.5.1):

  0   0 1 1
  D/C E W A
      C R A
  ^   ^ ^ ^
  |   | | accessed, DESC_A_MASK
  |   | |
  |   | for data: 0==r/o, 1==r/w
  |   | for code: 0==exec-only, 1==exec/read, DESC_R_MASK
  |   |
  |   for data: 1==expand down
  |   for code: 1==conforming
  |
  0 == data, 1 == code, DESC_CS_MASK

The type dumped by "info registers" is "data segment, expand up,
read/write, accessed".

I believe the D/C bit (bit 11) should be set, and then 1011b would mean
"code segment, non-conforming, exec/read, accessed".
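
The decoding above can be sketched in C. The helper names below are made
up for illustration and are not QEMU functions; the bit positions simply
follow the layout drawn above (type in bits 11:8, S in bit 12, DPL in
bits 14:13, P in bit 15 of the dumped flags word).

```c
#include <assert.h>
#include <stdint.h>

/* Field extractors for the "flags" column of "info registers".
 * Hypothetical helpers; bit positions taken from the diagram above. */
unsigned seg_present(uint32_t flags) { return (flags >> 15) & 0x1u; }
unsigned seg_dpl(uint32_t flags)     { return (flags >> 13) & 0x3u; }
unsigned seg_s(uint32_t flags)       { return (flags >> 12) & 0x1u; }
unsigned seg_type(uint32_t flags)    { return (flags >> 8) & 0xfu; }

/* Interpretation of the 4-bit type field when S == 1 (code/data). */
int type_is_code(unsigned type)
{
    assert(type <= 0xfu);               /* type is a 4-bit field */
    return (int)((type >> 3) & 0x1u);   /* D/C bit: 0 == data, 1 == code */
}

int type_is_accessed(unsigned type)
{
    assert(type <= 0xfu);
    return (int)(type & 0x1u);          /* A bit */
}
```

For the dumped value, seg_type(0x0000f300) is 0x3 and type_is_code(0x3)
is 0, i.e. a data segment. Setting bit 11 gives 0x0000fb00, whose type is
0xb, and type_is_code(0xb) is 1.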

(2) x86_cpu_reset() does pass DESC_CS_MASK for R_CS, but it doesn't seem
to be present in the dumped value.


I have no idea what's going on, but vmx_set_segment() in the kernel has
a bunch of hacks for CS && selector == 0xf000 && base == 0xffff0000, and
it seems to be host processor dependent. E.g., from commit b246dd5d:

        /*
         * Fix segments for real mode guest in hosts that don't have
         * "unrestricted_mode" or it was disabled.
         * This is done to allow migration of the guests from hosts with
         * unrestricted guest like Westmere to older host that don't have
         * unrestricted guest like Nehelem.
         */
        if (vmx->rmode.vm86_active) {
                switch (seg) {
                case VCPU_SREG_CS:
                        vmcs_write32(GUEST_CS_AR_BYTES, 0xf3);
                        vmcs_write32(GUEST_CS_LIMIT, 0xffff);
                        if (vmcs_readl(GUEST_CS_BASE) == 0xffff0000)
                                vmcs_writel(GUEST_CS_BASE, 0xf0000);
                        vmcs_write16(GUEST_CS_SELECTOR,
                                     vmcs_readl(GUEST_CS_BASE) >> 4);
                        break;
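
As a rough model of what that excerpt does to CS, here is a standalone C
sketch. struct rm_cs, fixup_rm_cs() and ar_to_flags() are hypothetical
names; the real code writes VMCS fields via vmcs_write*() instead of
returning a struct, and the mapping in ar_to_flags() is an assumption on
my part about how the access-rights byte shows up in the dump.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three CS fields the excerpt touches. */
struct rm_cs {
    uint16_t selector;
    uint32_t base;
    uint32_t limit;
    uint8_t  ar;      /* access-rights byte */
};

/* Model of the VCPU_SREG_CS case when rmode.vm86_active is set. */
struct rm_cs fixup_rm_cs(uint32_t base)
{
    struct rm_cs cs;

    cs.ar = 0xf3;
    cs.limit = 0xffff;
    if (base == 0xffff0000u)
        base = 0xf0000u;
    cs.base = base;
    cs.selector = (uint16_t)(base >> 4);   /* vmcs_write16() keeps 16 bits */

    /* Postcondition: the architectural reset base never survives. */
    assert(cs.base != 0xffff0000u);
    return cs;
}

/* ASSUMPTION: the dumped flags word carries the AR byte in bits 15:8. */
uint32_t ar_to_flags(uint8_t ar)
{
    return (uint32_t)ar << 8;
}
```

fixup_rm_cs(0xffff0000) yields base 0xf0000 and selector 0xf000, matching
the CS line in the dump. Under the stated assumption, ar_to_flags(0xf3) is
0x0000f300, which is also the flags value shown there.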

Also in init_vmcb() [arch/x86/kvm/svm.c] I can see (from commit
d92899a0):

        /*
         * cs.base should really be 0xffff0000, but vmx can't handle that, so
         * be consistent with it.
         *
         * Replace when we have real mode working for vmx.
         */
        save->cs.base = 0xf0000;

Going back to vmx, vmx_vcpu_reset() [arch/x86/kvm/vmx.c]:

        /*
         * GUEST_CS_BASE should really be 0xffff0000, but VT vm86 mode
         * insists on having GUEST_CS_BASE == GUEST_CS_SELECTOR << 4.  Sigh.
         */
        if (kvm_vcpu_is_bsp(&vmx->vcpu)) {
                vmcs_write16(GUEST_CS_SELECTOR, 0xf000);
                vmcs_writel(GUEST_CS_BASE, 0x000f0000);
        } else {
                vmcs_write16(GUEST_CS_SELECTOR,
                             vmx->vcpu.arch.sipi_vector << 8);
                vmcs_writel(GUEST_CS_BASE, vmx->vcpu.arch.sipi_vector << 12);
        }

The leading comment and the main logic date back to commit 6aa8b732
([PATCH] kvm: userspace interface).
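
The constraint named in that comment, GUEST_CS_BASE == GUEST_CS_SELECTOR
<< 4, can be checked for both branches with a small C sketch. The struct
and function names are invented for illustration; only the constants and
shifts come from the excerpt.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of values written at vCPU reset. */
struct reset_cs {
    uint16_t selector;
    uint32_t base;
};

/* BSP branch: constants as quoted from vmx_vcpu_reset(). */
struct reset_cs bsp_reset_cs(void)
{
    struct reset_cs cs;
    cs.selector = 0xf000;
    cs.base = 0x000f0000;
    return cs;
}

/* AP branch: both values are derived from the SIPI vector. */
struct reset_cs ap_reset_cs(unsigned sipi_vector)
{
    struct reset_cs cs;
    assert(sipi_vector <= 0xffu);          /* the vector is one byte */
    cs.selector = (uint16_t)(sipi_vector << 8);
    cs.base = (uint32_t)sipi_vector << 12;
    return cs;
}

/* The vm86 requirement from the comment: base == selector << 4. */
int base_matches_selector(struct reset_cs cs)
{
    return cs.base == ((uint32_t)cs.selector << 4);
}
```

Both bsp_reset_cs() and ap_reset_cs(v) satisfy base_matches_selector(),
whereas the pair selector 0xf000, base 0xffff0000 does not, which is why
the base is forced to 0x000f0000 there.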

(3) I wanted to ask you whether your laptop CPU is "more modern" than
your workstation CPU, but from your other email I guess they're indeed
different.

Laszlo


