[Qemu-devel] QEMU, self-modifying code, and Windows 7 64-bit (no KVM)


From: Hulin, Patrick - 0559 - MITLL
Subject: [Qemu-devel] QEMU, self-modifying code, and Windows 7 64-bit (no KVM)
Date: Wed, 13 Aug 2014 18:36:44 +0000

Hi QEMU devs,

QEMU 2.1 does not currently run Windows 7 64-bit without KVM. There have been 
a few threads about this over the past few years (such as 
https://bugs.launchpad.net/qemu/+bug/921208 and 
http://lists.gnu.org/archive/html/qemu-devel/2012-09/msg02603.html), but the 
problem was never resolved. I think I've identified the cause, but I am not 
sure what the correct way to fix it is. I'm working on PANDA, a set of analysis 
extensions to QEMU (github.com/moyix/panda) and I'd really like to be able to 
use our analyses on Windows 7 64-bit.

There are two issues right now. The first is that QEMU is missing a CPUID bit 
(for debug extensions, CPUID_DE) because the feature isn't implemented in QEMU. 
This can easily be hacked around by just enabling the bit, but I imagine you 
all aren't excited about advertising features that don't exist. The second 
issue is that both the installer and the OS itself fail with blue screens of 
DRIVER_IRQL_NOT_LESS_OR_EQUAL or KMODE_EXCEPTION_NOT_HANDLED (due to illegal 
instruction). This is a little trickier.

One of the major differences between Windows 7 x86 and x64 is that the 64-bit 
version has Microsoft's Kernel Patch Protection, aka PatchGuard. In order to 
protect itself, PatchGuard lives encrypted in memory and follows a two-stage 
decryption process. The process begins with a series of xor's which 
successively decrypt the PatchGuard code. This is self-modifying code (in 
particular, the first xor overwrites itself and the next instruction).

For the uninitiated, as I understand it, QEMU's self-modifying code support 
works in the following way. Before executing a translation block, QEMU 
write-protects (using host MMU features) the _host_ page that contains the 
section of guest memory on which the guest TB code lives. When self-modifying 
code attempts to write to that page, it triggers a host segmentation fault. 
QEMU then catches this segmentation fault using standard POSIX signal 
infrastructure. Once caught it walks into the software MMU code. If the write 
intersects the current TB, QEMU splits the TB into two: the single instruction 
that is being executed and the rest of the block, which is invalidated so it 
will be retranslated as soon as QEMU tries to run it. QEMU then restores the 
pre-write CPU state (cpu_restore_state) and longjmp's out 
(cpu_resume_from_signal). The instruction then executes again, and this time it 
actually makes the write to QEMU's memory state. QEMU translates the new code, 
which is now in its own TB, and continues from there.

In this case, the write is 8 bytes and unaligned, so it gets split into 8 
single-byte writes. In stock QEMU, these writes are done in reverse order (see 
the loop in softmmu_template.h, line 402). The third decryption xor from Kernel 
Patch Protection should hit 4 bytes that are in the current TB and 4 bytes in 
the TB afterwards in linear order. Since this happens in reverse order, and the 
last 4 bytes of the write do not intersect the current TB, those writes happen 
successfully and QEMU's memory is modified. The 4th byte in linear order (the 
5th in temporal order) then triggers the current_tb_modified flag and 
cpu_restore_state, longjmp'ing out. However, cpu_restore_state only goes back 
to right before that byte is written, so the last 4 bytes (the ones outside the 
current TB) stay modified. QEMU then invalidates, retranslates, and runs 
the xor again. This successfully decrypts the 4 bytes inside the current TB, 
but because the write to the last 4 bytes was not reversed as it should have 
been, those bytes get xor'd a second time. Effectively, QEMU mistakenly 
re-encrypts those bytes. Once the code is incorrect, inaccuracies build up 
until something blue screen-able happens (in this case, an illegal instruction 
or various kinds of bad accesses).
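
To make the sequence concrete, here is a toy model of it in C. This is a 
sketch, not QEMU code: ToyGuest, toy_store_byte, toy_xor_store and toy_run_xor 
are made-up names, the "TB" is just a byte range, and the longjmp out of the 
store is modeled as an early return followed by a retry of the whole 
instruction. It assumes, as described above, that byte writes made before the 
fault are not undone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STORE_SIZE 8

/* Toy guest: a little memory plus the byte range of the current TB. */
typedef struct {
    uint8_t mem[16];
    size_t tb_start;  /* first byte covered by the current TB */
    size_t tb_end;    /* one past the last byte covered by the current TB */
    int tb_valid;     /* cleared once a write has hit the current TB */
} ToyGuest;

/* Store one byte. Returns -1 (standing in for the longjmp) if the write
 * hits the still-valid current TB; the byte is NOT written in that case. */
static int toy_store_byte(ToyGuest *g, size_t addr, uint8_t val)
{
    if (g->tb_valid && addr >= g->tb_start && addr < g->tb_end) {
        g->tb_valid = 0;  /* invalidate; the instruction will be retried */
        return -1;
    }
    g->mem[addr] = val;
    return 0;
}

/* One attempt at an 8-byte "xor [addr], key" whose store is split into
 * single-byte writes. reverse != 0 writes the last byte first. */
static int toy_xor_store(ToyGuest *g, size_t addr,
                         const uint8_t key[STORE_SIZE], int reverse)
{
    uint8_t val[STORE_SIZE];
    size_t n;

    /* Load phase: read memory as it is right now and compute the result. */
    for (n = 0; n < STORE_SIZE; n++) {
        val[n] = g->mem[addr + n] ^ key[n];
    }
    /* Store phase: bytes written before a fault are not rolled back. */
    for (n = 0; n < STORE_SIZE; n++) {
        size_t i = reverse ? STORE_SIZE - 1 - n : n;
        if (toy_store_byte(g, addr + i, val[i]) < 0) {
            return -1;
        }
    }
    return 0;
}

/* Retry the instruction until it completes, as happens after the TB is
 * invalidated and retranslated. */
static void toy_run_xor(ToyGuest *g, size_t addr,
                        const uint8_t key[STORE_SIZE], int reverse)
{
    assert(addr + STORE_SIZE <= sizeof(g->mem));
    while (toy_xor_store(g, addr, key, reverse) < 0) {
        /* retry */
    }
}
```

With the TB covering bytes 0..7 and the xor targeting bytes 4..11, the 
reverse-order run leaves bytes 4..7 decrypted and bytes 8..11 xor'd twice 
(i.e. still encrypted), while the forward-order run faults on the very first 
byte, writes nothing, and decrypts all 8 bytes on the retry.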

I am not sure how to fix this issue. For now, in our tool, PANDA, we have just 
reversed the order of the loop, so the bytes are written in forward order. That 
is enough for Windows 7 x64 to run successfully without KVM, which is all we 
really need for our purposes. But the change will fail in any situation in 
which the write happens off the front end of the TB and then the self-modifying 
code loops back to the previous TB.

I looked back in the commit history for this area of the code. It looks like 
the order of the loop was changed from forwards to backwards back in 2007 by 
the following two commits:

commit 6c41b2723f5cac6e62e68925e7a73f30b11a7a06
Author: balrog <address@hidden>
Date:   Sat Nov 17 12:12:29 2007 +0000
    Don't compare '\0' against pointers.
    Add a note from Fabrice in slow_st template.
        
    git-svn-id: svn://svn.savannah.nongnu.org/qemu/address@hidden 
c046a42c-6fe2-441c-8c8c-71466251a162
 
commit 7221fa98d381a19b8809979934554644381fb88c
Author: balrog <address@hidden>
Date:   Sat Nov 17 09:53:42 2007 +0000
    Check permissions for the last byte first in unaligned slow_st accesses 
(patch from TeLeMan).
    
    git-svn-id: svn://svn.savannah.nongnu.org/qemu/address@hidden 
c046a42c-6fe2-441c-8c8c-71466251a162

The relevant qemu-devel thread is here: 
https://lists.gnu.org/archive/html/qemu-devel/2007-10/msg00646.html. It looks 
like the author was trying to fix a page boundary bug where the write was off 
the front of the write-protected page and would happen twice, just as in this 
case. Unfortunately, the "fix" just moved the problem to a different case. 
Fabrice commented on that patch in this thread: 
https://lists.gnu.org/archive/html/qemu-devel/2007-11/msg00538.html, saying 
that the reverse-order code would work across forward page boundaries, 
essentially by chance. Unfortunately, it caused the code to fail on forward TB 
boundaries.

If it's not too complicated, I'd like to contribute an actual fix back 
upstream. I don't understand the MMU code completely, so if I've gotten 
anything wrong, please correct me. As I see it, there are two options, neither 
of which seems easy under the current control flow:

- Make sure cpu_restore_state goes all the way back to the beginning of the 
stq, and not just the most recent stb.
- Specifically check to see if an stq intersects the current TB before 
splitting it into the 8 stb's. 
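
As a rough illustration of the second option, here is a toy model in C that 
checks the whole 8-byte range against the current TB before any byte is 
written. It is only a sketch under simplifying assumptions: SketchGuest and the 
sketch_* functions are hypothetical names, not QEMU APIs, the "TB" is just a 
byte range, and the longjmp is modeled as an early return followed by a retry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_STORE_SIZE 8

/* Toy guest: a little memory plus the byte range of the current TB. */
typedef struct {
    uint8_t mem[16];
    size_t tb_start;  /* first byte covered by the current TB */
    size_t tb_end;    /* one past the last byte covered by the current TB */
    int tb_valid;     /* cleared once a store is found to hit the TB */
} SketchGuest;

/* Does [addr, addr + len) overlap the still-valid current TB? */
static int sketch_hits_current_tb(const SketchGuest *g, size_t addr, size_t len)
{
    return g->tb_valid && addr < g->tb_end && addr + len > g->tb_start;
}

/* One attempt at an 8-byte "xor [addr], key". If the store as a whole
 * overlaps the current TB, invalidate and bail out BEFORE writing any
 * byte, so the retry starts from unmodified memory. */
static int sketch_xor_store_checked(SketchGuest *g, size_t addr,
                                    const uint8_t key[SKETCH_STORE_SIZE])
{
    uint8_t val[SKETCH_STORE_SIZE];
    size_t n;

    assert(addr + SKETCH_STORE_SIZE <= sizeof(g->mem));
    if (sketch_hits_current_tb(g, addr, SKETCH_STORE_SIZE)) {
        g->tb_valid = 0;
        return -1;  /* nothing written yet */
    }
    for (n = 0; n < SKETCH_STORE_SIZE; n++) {
        val[n] = g->mem[addr + n] ^ key[n];
    }
    /* Byte order no longer matters; reverse order is used here on purpose. */
    for (n = SKETCH_STORE_SIZE; n-- > 0; ) {
        g->mem[addr + n] = val[n];
    }
    return 0;
}

/* Retry the instruction until it completes. */
static void sketch_run_xor(SketchGuest *g, size_t addr,
                           const uint8_t key[SKETCH_STORE_SIZE])
{
    while (sketch_xor_store_checked(g, addr, key) < 0) {
        /* retry */
    }
}
```

In this model the first attempt writes nothing, so every byte is xor'd exactly 
once on the retry, regardless of the order of the byte loop. Whether the 
equivalent check fits into the real slow-path store code is exactly the open 
question.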

There are probably others though. Thoughts? Questions? It would be really 
awesome to get a real fix for this bug.

P.S. Windows 8 x64 still fails, even after my forward-loop patch. I'm working 
on debugging that too.
