
From: Lluís
Subject: Re: [Qemu-devel] How to make shadow memory for a process? and how to trace the data propation from the instruction level in QEMU?
Date: Tue, 16 Nov 2010 14:49:27 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.50 (gnu/linux)

F Zhang writes:

> This topic includes things that I recognized as critical. Have you any
> suggestions?

Sorry, I don't understand what you want suggestions about.


>>>>  Yes. For each process’s memory space A, I wanna make a shadow memory
>>>>  B. The shadow memory is used to store the tag of data. In other words,
>>>>  if addr in memory A is tainted, then the corresponding byte in B should
>>>>  be marked to indicate that addr in A is tainted.
>> 
>> The main question here is... what is the granularity that you want to
>> track with? Bytes? Words? Pages? This will greatly influence which is
>> your best approach.

> I think one byte per tag is necessary for malware analysis in most cases,
> because only a few bytes are used to launch an attack. For example, a few
> tainted bytes sent to EIP register will cause CPU to do bad things.
[...]
>> Now that I think of it, you could use the tracing points I sent for
>> guest virtual memory accesses, and instrument them instead of calling a
>> file-tracing backend (this should provide a hook for an arbitrary
>> granularity). Then, simply keep track also of address-space changes and
>> your instrumentation code can always know when to activate propagation.
>> 

> Sorry, what is “a file-tracing backend”? Could you be a little more
> detailed? I think I need byte-level granularity. Thanks!

Well, the initial patch series I sent was based on macros, so that you
can place any code you want in these macros, not only tracing.

In its current form (sorry, I don't have spare time right now to finish
it), the points generate code for tracing, but there is a patch series
that lets the user re-define some of the trace points to call any
function provided by the user (look for the "trace-instrument" series).
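
To make the byte-granularity idea concrete, here is a minimal sketch of
what user-provided instrumentation functions could maintain. Every name
in it (ShadowMem, shadow_store, shadow_load) is hypothetical and not part
of QEMU or of any patch series; it also assumes a 32-bit guest address
space, one tag byte per guest byte, and one ShadowMem per guest process.
How these functions get attached to the memory-access trace points is
left out on purpose.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* All names here are hypothetical; this is not QEMU API. */
#define SHADOW_PAGE_BITS 12u
#define SHADOW_PAGE_SIZE (1u << SHADOW_PAGE_BITS)
#define SHADOW_NPAGES    (1u << (32u - SHADOW_PAGE_BITS)) /* 32-bit guest assumed */

typedef struct ShadowMem {
    /* One lazily allocated tag page per guest page; NULL means "no taint". */
    uint8_t *pages[SHADOW_NPAGES];
} ShadowMem;

static ShadowMem *shadow_new(void)
{
    return calloc(1, sizeof(ShadowMem));
}

/* Tag a single guest byte; tag 0 means untainted. */
static void shadow_set_byte(ShadowMem *sm, uint32_t addr, uint8_t tag)
{
    uint32_t idx = addr >> SHADOW_PAGE_BITS;
    uint32_t off = addr & (SHADOW_PAGE_SIZE - 1u);

    if (sm->pages[idx] == NULL) {
        if (tag == 0) {
            return;                     /* nothing to clear */
        }
        sm->pages[idx] = calloc(SHADOW_PAGE_SIZE, 1);
        assert(sm->pages[idx] != NULL); /* sketch: no graceful OOM handling */
    }
    sm->pages[idx][off] = tag;
}

static uint8_t shadow_get_byte(const ShadowMem *sm, uint32_t addr)
{
    const uint8_t *page = sm->pages[addr >> SHADOW_PAGE_BITS];
    return page ? page[addr & (SHADOW_PAGE_SIZE - 1u)] : 0;
}

/* Hook for a guest store of `size` bytes whose source carries `tag`. */
static void shadow_store(ShadowMem *sm, uint32_t addr, uint32_t size,
                         uint8_t tag)
{
    for (uint32_t i = 0; i < size; i++) {
        shadow_set_byte(sm, addr + i, tag);
    }
}

/* Hook for a guest load: the result is the OR of the tags of all bytes read. */
static uint8_t shadow_load(const ShadowMem *sm, uint32_t addr, uint32_t size)
{
    uint8_t tag = 0;

    for (uint32_t i = 0; i < size; i++) {
        tag |= shadow_get_byte(sm, addr + i);
    }
    return tag;
}
```

Tag pages are only allocated on the first tainted write, so a process
that never touches tainted data pays for the top-level pointer table and
nothing else.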


>> This, together with the optimization I sent for dynamic control of trace
>> generation in TCG emulation code should get you on tracks.
>> 
>> Of course, you should still modify all register-accessing instructions
>> to propagate information passing through the register set. For that,
>> maybe you could start with the "fetch" tracing/instrumentation point I
>> sent long time ago, which keeps track of general-purpose register
>> usage/definition on x86 (although I'm sure I left some astray usages due
>> to the decoding complexity in x86).

> Thanks! I will read that code first, though I am currently just a newbie. :(

>>>>  The guest os collects “higher” semantic
>>>>  from the OS level, and the QEMU collects “lower” semantic from the
>>>>  instruction level. Combination of both semantics is necessary in the
>>>>  analysis process.
>> 
>>>  The question is, in a situation where malware already compromise "the
>>>  higher semantic", could we trust the analysis?
>> 
>> Beware, I've read exactly this kind of scheme on previous top-tier
>> conferences (but I think tests were using an architectural simulator, so
>> it's not for a current production environment).
>> 
>> I've found it :)
>> 
>>      Secure program execution via dynamic information flow tracking
>>      ASPLOS 2004
>> 
> That is a significant paper, which is cited for more than 300 times!

That's why I said you should be careful. Porting this kind of analysis
into QEMU is not significant by itself, although I suppose it should
gain some extra relevance if you implement it in such a way that it can
be used on a production system.

You could start with guest OS taint propagation, and through the "guest
OS to QEMU" channel, activate taint propagation checks when a process
gains access to tainted information coming from the outer world (e.g.,
socket read) [*]. Then, you can conditionally generate taint checks like
I did in the "trace-gen" series, so that programs without access to
tainted information will have no checks at all.
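
As a rough sketch of that gating (hypothetical names only, nothing here
comes from the "trace-gen" series): keep a small set of guest address
spaces that have received tainted data, and consult it at translation
time to decide whether to emit the checks. The sketch assumes an address
space is identified by a single 64-bit value, such as the page-table
base, and that the set is small enough for a linear scan.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; this is a sketch, not QEMU code. */
#define TAINT_MAX_SPACES 64

typedef struct TaintGate {
    uint64_t space_id[TAINT_MAX_SPACES]; /* address spaces holding taint */
    size_t count;
} TaintGate;

typedef enum { EMIT_PLAIN, EMIT_WITH_TAINT_CHECKS } EmitMode;

static bool taint_gate_active(const TaintGate *g, uint64_t id)
{
    for (size_t i = 0; i < g->count; i++) {
        if (g->space_id[i] == id) {
            return true;
        }
    }
    return false;
}

/* Called when the guest OS reports that a process received tainted data.
 * Returns false only if the table is full. */
static bool taint_gate_enable(TaintGate *g, uint64_t id)
{
    if (taint_gate_active(g, id)) {
        return true;
    }
    if (g->count == TAINT_MAX_SPACES) {
        return false;
    }
    g->space_id[g->count++] = id;
    return true;
}

/* Called at translation time: untainted processes get plain code. */
static EmitMode choose_emit_mode(const TaintGate *g, uint64_t current_space)
{
    return taint_gate_active(g, current_space) ? EMIT_WITH_TAINT_CHECKS
                                               : EMIT_PLAIN;
}
```

Code that was already translated without checks would have to be
discarded or retranslated once a process becomes tainted; the sketch does
not cover that part.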

Even better, the optimal solution would be to run in KVM mode when no
instruction-based taint checking is needed, and use QEMU emulation
otherwise. The downside is that I was told this is not currently
possible with multiple CPUs, and only theoretically possible with one
CPU.

[*] This is just a rough summary of what I remember from the ASPLOS
    paper


>>>>  The question is: how to communicate between the QEMU and the guest OS, so
>>>>  that they can cooperate with each other?
>> 
>> A few choices here, but you should first define if the communication
>> must be based just on control signals, and/or providing memory storage:
>>   * virtual device : If you need some kind of storage that the guest OS
>>     must access, you could look at the ivshmem device
>>   * backdoor instruction : It's the simplest option; I sent some patch
>>     series recently with two different implementations for x86.
>> 
>> 

> Both of control signals and (shadow) memory storage are required. So, the
> virtual device may be the right choice.

Shadow memory is not a problem here, as it can be handled by the
instrumented trace points and the guest OS has no need to access it. So
from my understanding, just using an instruction-based backdoor is
sufficient for the guest OS to tell QEMU when taint propagation must be
performed, and on which memory addresses it must start propagating.
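
A minimal sketch of the QEMU-side handler for such a backdoor could look
like the following. The command codes, the TaintRequest structure and the
convention of passing a command plus two arguments (for example in guest
registers) are all assumptions made for illustration; they are not taken
from the backdoor patch series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical command codes chosen for this sketch. */
enum { BD_TAINT_START = 1, BD_TAINT_STOP = 2 };

typedef struct TaintRequest {
    bool active;     /* is propagation currently requested? */
    uint64_t start;  /* first guest virtual address to taint */
    uint64_t len;    /* number of bytes to taint */
} TaintRequest;

/* Decode one backdoor invocation.
 * Returns 0 on success, -1 on an unknown command or a bad range. */
static int backdoor_dispatch(TaintRequest *req, uint64_t cmd,
                             uint64_t arg0, uint64_t arg1)
{
    switch (cmd) {
    case BD_TAINT_START:
        /* Reject empty ranges and ranges whose end wraps around. */
        if (arg1 == 0 || arg0 + arg1 < arg0) {
            return -1;
        }
        req->active = true;
        req->start = arg0;
        req->len = arg1;
        return 0;
    case BD_TAINT_STOP:
        req->active = false;
        return 0;
    default:
        return -1;
    }
}
```

The guest only passes control information this way; the tags themselves
stay on the QEMU side, so no shared storage is needed.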


> In this year’s top security conferences (Oakland, CCS, Usenix Security,
> NDSS and so on), many works are based on virtual technology. So I think
> QEMU is a good choice for future academic research.

I'm not much of a security expert, so I don't know what the current
state of the art is, but if you embark on the journey of implementing
this in QEMU, first make sure you can provide novel ideas and features
on top of this infrastructure.

Coding this kind of thing is fun, but if it's just for the sake of
coding, this won't get you publications (I remember reading that you are
doing a PhD); believe me, I've gone through this before :)

Lluis

-- 
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth


