Re: [Qemu-devel] Qemu + CUDA: a new possible way?

From: Blue Swirl
Subject: Re: [Qemu-devel] Qemu + CUDA: a new possible way?
Date: Sat, 6 Jun 2009 00:31:01 +0300

On 6/5/09, OneSoul <address@hidden> wrote:
> Hello all!
>  I've been a Qemu user for a long time and I'm very satisfied with its
>  flexibility, power and portability - really a good project!
>  Recently, reading some technical articles on the internet, I
>  discovered the great potential of the CUDA framework for
>  scientific and graphics computing, which takes strong advantage of
>  the most recent GPUs. People have used it for password recovery,
>  realtime rendering, etc., with great results.
>  Would it be possible to use this technology in the Qemu project to
>  achieve better performance?
>  Could it be a significant step for the development of virtualization
>  technology?
>  Someone, for example, has experimentally rewritten the md-raid
>  kernel modules using the CUDA framework to accelerate the Reed-Solomon
>  features... and it seems to work fine.
>  Why not for Qemu or related components?
>  The main question is about the dynamic translation engine: can it be
>  modified for this framework?

It should be possible to make a CUDA target for TCG, judging from a
quick look at the PTX documentation.

I don't know whether that makes sense from a performance point of view:
how much time do PTX compilation and transfer to the GPU take? Native
GPU machine code would be faster.

>  Some say that Qemu is NOT parallelizable... but that seems strange,
>  because by definition it is "Fast and Portable".
>  Is it not portable to this framework?
>  Note that computing on the GPU is driven through a kernel module,
>  not directly.

The problem for CPUs is that emulation of atomic operations is costly.
Maybe the native thread synchronization operations in CUDA could help.
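
To make the cost concrete, here is a minimal sketch of how an emulator
might map a guest atomic operation onto host atomics. It uses C11
<stdatomic.h>; the names guest_word, emulate_cmpxchg and
emulate_atomic_add are hypothetical and are not taken from QEMU. CUDA
offers a comparable primitive (atomicCAS), so a GPU port could in
principle follow the same pattern, but that is an assumption, not
something I have tried.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical guest memory word shared by several emulated CPUs. */
typedef struct {
    _Atomic uint32_t value;
} guest_word;

/* Emulate a guest compare-and-exchange: store new_val only if the
 * word still holds expected. Returns the value that was observed. */
static uint32_t emulate_cmpxchg(guest_word *w, uint32_t expected,
                                uint32_t new_val)
{
    uint32_t seen = expected;
    /* On failure, seen is overwritten with the current contents. */
    atomic_compare_exchange_strong(&w->value, &seen, new_val);
    return seen;
}

/* Emulate a guest atomic add with a compare-and-swap retry loop.
 * Returns the value the word held before the addition. */
static uint32_t emulate_atomic_add(guest_word *w, uint32_t delta)
{
    uint32_t old = atomic_load(&w->value);
    while (!atomic_compare_exchange_weak(&w->value, &old, old + delta)) {
        /* old was refreshed with the current value; try again. */
    }
    return old;
}
```

Every guest atomic instruction turns into at least one host atomic
operation, and possibly a retry loop under contention, which is where
the extra cost comes from.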

If QEMU runs in user space, the user to kernel to GPU switches will
increase latency. At least the dynamic translation code should then
also run on the GPU, leaving only the IO device handling to the CPU.
Obviously VGA emulation should reside in the GPU if possible.

>  What do you think about this draft idea? It's just a proof-of-concept,
>  but I hope it is useful.
>  Any feedback is welcome...

What is the performance of a single execution unit? If you emulate an
x86 system, I'd think you get more cycles to run the emulator on a CPU
that runs at 2 GHz than on a GPU running at only 500 MHz. Maybe
you could emulate a system with 16384 CPUs @ 500MHz? Even if the
performance of a single emulator is not great, it may still be
attractive for server farms.
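
A back-of-the-envelope version of that comparison, using only the
illustrative figures above (2 GHz, 500 MHz, 16384 units). The helper
names are made up for this sketch, and raw clock cycles ignore how much
work each cycle does, so treat the result as a rough bound at best.

```c
#include <stdint.h>

/* Total clock cycles per second across all execution units. */
static uint64_t aggregate_hz(uint64_t units, uint64_t hz_per_unit)
{
    return units * hz_per_unit;
}

/* How many times faster one CPU clock is than one GPU unit clock. */
static uint64_t per_unit_ratio(uint64_t cpu_hz, uint64_t gpu_unit_hz)
{
    return cpu_hz / gpu_unit_hz;
}
```

With these numbers, per_unit_ratio(2000000000, 500000000) is 4, so a
single emulated CPU gets a quarter of the cycles, while
aggregate_hz(16384, 500000000) is 8192000000000 cycles per second in
total, 4096 times the single 2 GHz CPU.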

Taking the idea one step further: could the CUDA framework be
virtualized? Though it looks like there are no exceptions or privilege
levels, so CUDA can't run an OS.
