From: Blue Swirl
Subject: Re: [Qemu-devel] Automatic generation of code-generator components (RETRY)
Date: Tue, 20 Jul 2010 22:14:52 +0000

On Mon, Jul 19, 2010 at 2:54 PM, Eliot Moss <address@hidden> wrote:
> Dear developers -- I've seen no responses yet.  My proposal
> is due in early August, so if anyone has feelings on this or
> comments about it, please respond soon :-) ...
>
> Thank you for your patience -- Eliot Moss
>
> -------- Original Message --------
> Subject: [Qemu-devel] Automatic generation of code-generator components
> Date: Tue, 13 Jul 2010 14:09:56 -0400
> From: Eliot Moss <address@hidden>
> Reply-To: address@hidden
> To: address@hidden
>
> Dear QEMU developers --
>
> I have had some email conversation with a few active developers,
> and with their encouragement, want to open it up for the whole list
> to comment.
>
> For several years my research group at UMass has been developing
> generic code-generator generator (CGG) technology. Historic CGGs
> have always been tied to a particular code-generation framework,
> that is, to a particular intermediate representation (IR) and
> compiler.  Our tool, called GIST (for Generator of Instruction
> Selectors Tool), is designed to work from any reasonable IR and
> to connect to any reasonable framework.  More technical details
> below, but what we are hoping for is to be able to say that if
> we make this industrial-strength with some funding from the
> National Science Foundation, the QEMU community will be interested
> in using it.  No commitment -- just that you think it *might*
> be a good idea if we can make it go.  We would use QEMU as one
> of our "demo" environments.

If you keep the development process open, with patches, RFCs and other
proposals flowing regularly to upstream QEMU, I'd expect the developer
community to welcome this. But there have been several forks of QEMU,
and merging those looks hard. The problem with those was that the
development was kept behind a curtain until it was 'finished' according
to some internal standard, and only then was it presented to us as
something that should just be swallowed whole. With a continuous
cooperation model the result would be seamless. It's also possible that
there's too much of a mismatch between the goals of the two projects,
but with early involvement of both sides, that can be detected before
it's too late.

> Ok, more details.  We have an architecture description language
> called CISL (CoGenT Instruction Set Language; CoGenT is our overall
> project's name).  It is somewhat like C or Java in appearance. You
> define the various memories and registers, and the instructions.
> To generate an instruction selector from input ISA A (generally
> a compiler IR, but not necessarily) to output ISA B (generally,
> but not always, a hardware architecture), you start with descriptions
> of A and B in CISL -- some of which may already be around.  You
> also write what we call a *mapping* from A to B, which simply
> indicates where on B each memory/register of A should go.  The
> tool then finds instruction selector patterns, at least one for
> each instruction of the A machine.
>
> For any given retargetable *framework* (compiler, interpreter, emulator),
> we write one *adapter*, that knows how to take GIST patterns in their
> internal form and write them out in the way that the framework
> needs them.
>
> Here's an example.  Suppose we are going from A = QEMU IR to
> B = MIPS, that is, the same as the TCG back end for an emulator
> running on the MIPS processor.  We have written a CISL
> description for the QEMU IR (yes, already), and suppose we have
> one for the MIPS, sufficient for code generation anyway.
> [Side note: Compilers do not generally use every instruction
> of their target, e.g., not the privileged mode ones, etc.
> Also, in the presence of register allocation, they generally
> target a slightly virtualized machine -- one with a huge
> number of registers, which register allocation then resolves
> to real registers and occasional spilled locations.]
>
> The mapping would talk about how to find QEMU memory on the
> MIPS (perhaps a dedicated base register), etc., and would
> also capture the conventions for calling helper routines,
> and so on.
>
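To make the memory part of the mapping concrete, here is a minimal
sketch. It assumes the mapping dedicates the MIPS callee-saved register
$s0 to a pointer to the emulated CPU state, and that guest register n
lives at offset 4*n in that structure; the helper names and the layout
are invented for illustration, only the MIPS lw encoding is real:

/* Illustration only: not GIST output and not the real TCG MIPS back
 * end.  Assumes the mapping dedicates $s0 (register 16) to the
 * emulated CPU state pointer, with guest register n at offset 4*n. */
#include <stdint.h>
#include <stdio.h>

#define ENV_BASE 16   /* $s0, reserved by the mapping */

/* Encode MIPS "lw rt, offset(rs)" (opcode 0x23). */
static uint32_t mips_lw(unsigned rt, unsigned rs, int offset)
{
    return (0x23u << 26) | (rs << 21) | (rt << 16)
           | ((uint32_t)offset & 0xffffu);
}

/* Load guest register n into host register rt. */
static uint32_t load_guest_reg(unsigned rt, unsigned n)
{
    return mips_lw(rt, ENV_BASE, 4 * n);
}

int main(void)
{
    /* e.g. load guest register 3 into host register $t0 (register 8) */
    printf("%08x\n", (unsigned)load_guest_reg(8, 3));
    return 0;
}
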
> The adapter for QEMU TCG back ends would generate something
> like a C switch statement with one case for each QEMU IR
> instruction. Each case might have some additional case
> analysis. This is because (as you see in QEMU), a given
> IR instruction can have special cases depending on values
> of constants, whether something is in a register, etc.
> GIST will have found different patterns for each of these,
> and with each one there would be a *constraint*, indicating
> when it applies.  For example, patterns for adding a constant
> value on the MIPS would likely have a special case for
> constants that fit in 16 bits, since then you can use one
> immediate instruction.  Likewise, the constant 0 is a
> special case since it can just be a move.  In addition
> to constraints, patterns have costs, which one can develop
> for any given target, but would typically be based on
> number of instructions, number of instruction bytes,
> number of memory references, etc. Thus the case analysis
> for a given instruction would check for the lowest cost
> patterns first, and would conclude with the most general
> pattern (but which may be the most expensive).
>
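As a rough sketch of what one such generated case could look like,
with the cheapest applicable pattern checked first and the most
general one last: the gen_addi entry point, the emit helpers and the
code buffer are invented for the example, only the MIPS instruction
encodings (addu, addiu, lui, ori) are real.

/* Illustration only, not GIST output and not the real TCG MIPS back
 * end.  Shows the constraint-ordered case analysis for one IR
 * operation, "add register + constant", on a MIPS host. */
#include <stdint.h>
#include <stdio.h>

static uint32_t code_buf[64];
static int      code_len;

static void emit32(uint32_t insn) { code_buf[code_len++] = insn; }

/* MIPS I-type: op rs rt imm16 */
static void emit_itype(unsigned op, unsigned rs, unsigned rt, uint32_t imm)
{
    emit32((op << 26) | (rs << 21) | (rt << 16) | (imm & 0xffffu));
}

/* MIPS R-type: op=0 rs rt rd shamt=0 funct */
static void emit_rtype(unsigned rs, unsigned rt, unsigned rd, unsigned funct)
{
    emit32((rs << 21) | (rt << 16) | (rd << 11) | funct);
}

/* Generated case for "dst = src + cst"; dst and src are host
 * registers already chosen by register allocation, $at (1) is used
 * as a scratch register. */
static void gen_addi(unsigned dst, unsigned src, int32_t cst)
{
    if (cst == 0) {
        /* Cheapest pattern: the addition degenerates to a move. */
        emit_rtype(src, 0, dst, 0x21);                /* addu dst, src, $0   */
    } else if (cst >= -32768 && cst <= 32767) {
        /* Constraint: constant fits the 16-bit signed immediate. */
        emit_itype(0x09, src, dst, (uint32_t)cst);    /* addiu dst, src, cst */
    } else {
        /* Most general (and most expensive) pattern: build the
         * constant in $at, then do a register-register add. */
        emit_itype(0x0f, 0, 1, (uint32_t)cst >> 16);  /* lui $at, hi16       */
        emit_itype(0x0d, 1, 1, (uint32_t)cst);        /* ori $at, $at, lo16  */
        emit_rtype(src, 1, dst, 0x21);                /* addu dst, src, $at  */
    }
}

int main(void)
{
    gen_addi(2, 3, 12);          /* fits in 16 bits: one addiu      */
    gen_addi(2, 3, 0x12345678);  /* does not fit: lui + ori + addu  */
    for (int i = 0; i < code_len; i++)
        printf("%08x\n", (unsigned)code_buf[i]);
    return 0;
}

Compiled and run, this prints one word for the small constant and
three for the large one, which is exactly the cost difference the
constraint ordering is meant to exploit.
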
> The adapter would also need to generate the information
> needed by the QEMU TCG register allocator.
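
For reference, in QEMU that information amounts to a per-opcode table
of operand constraints that the register allocator consults: which
operands must sit in host registers and which may be immediates. A
simplified, hypothetical sketch of that shape, loosely modelled on the
constraint strings in the tcg/*/tcg-target.c files rather than copying
their actual data structures:

/* Simplified and hypothetical: mimics the flavor of TCG's per-opcode
 * operand-constraint tables without reproducing the real structures. */
struct op_constraints {
    const char *op;        /* IR opcode name */
    const char *args[3];   /* one string per operand:
                              "r"  = must be in a host register,
                              "ri" = register or immediate,
                              ""   = operand not used */
};

static const struct op_constraints example_ops[] = {
    { "add_i32", { "r", "r", "ri" } },   /* dst, src1, src2 (reg or imm) */
    { "mul_i32", { "r", "r", "r"  } },   /* dst, src1, src2              */
};
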

I think this case is interesting for currently unimplemented hosts, or
if performance on existing hosts improves over the current TCG back
ends. The extra optimization pass may cost translation time that is
never won back by the faster execution of the better generated code.
There's also KVM, which should always be the fastest option when the
host and target CPUs match.

>
> Now, here are some things of additional interest:
>
> - While QEMU IR -> emulation host code-generation is maybe
>  the most obvious case, we can also handle the "front end"
>  emulation target -> QEMU IR generation.  This probably
>  requires a slightly different description of machine A
>  than when A is the emulation host -- after all, we must
>  handle *all* instructions, including privileged ones,
>  etc.  But it is possible to make the descriptions
>  modular in such a way that instructions used in both
>  cases are not repeated.

Again, the current implementations basically consist of one huge C
switch statement, so it's hard to improve on that for performance.
Enabling new targets would be interesting, though, and maybe one day an
automatically generated translator could beat a hand-written one.

> - I noticed that someone is looking at interpretation
>  rather than compilation.  We have seen that we can generate
>  functional simulators (very close to emulators) from
>  CISL descriptions.  Thus, it would be possible to generate
>  a simulator for any of the machines of interest.  What
>  QEMU provides is a framework with all the memory and
>  device modeling, etc.
>
> - An approach like this might make it easier to maintain a
>  range of different models of the same ISA.  It might also
>  facilitate moving towards multicore emulation, maybe even
>  heterogeneous multicore.  It would also make it easier to
>  change around how the simulated memory is organized and
>  accessed, if that would be helpful.
>
> - It would make it particularly simple to build an emulator for
>  a new or extended machine.  Of course you still need a compiler
>  for it, but we can use the same description to generate a C
>  compiler, etc.
>
> This would be a several year long project, with real support ($$)
> for three or more years.  The goal is for GIST to have its own
> self-sustaining open-source community after that.  We are in
> conversation with some other software communities of interest
> concerning whether they would also be in favor of the project.
> These include the Jikes RVM Java Virtual Machine project
> (both the optimizing and the non-optimizing compilers), another
> compiler framework, and a simulator framework.
>
> I look forward to your thoughts, questions, and reactions.
>
> Regards -- Eliot Moss
> ==============================================================================
> J. Eliot B. Moss, Professor               http://www.cs.umass.edu/~moss (www)
> Director, Arch. and Lang. Impl. Lab.      +1-413-545-4206 (voice)
> Department of Computer Science            +1-413-695-4226 (cell)
> 140 Governor's Drive, Room 372            +1-413-545-1249 (fax)
> University of Massachusetts at Amherst    address@hidden (email)
> Amherst, MA  01003-9264  USA              +1-413-545-2746 Laurie Downey (sec'y)
> ==============================================================================
>
>
>


