qemu-devel

Re: [Qemu-devel] [RFC] Streamlining endian handling in TCG


From: Edgar E. Iglesias
Subject: Re: [Qemu-devel] [RFC] Streamlining endian handling in TCG
Date: Wed, 28 Aug 2013 22:42:39 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On Wed, Aug 28, 2013 at 08:26:43AM -0700, Richard Henderson wrote:
> On 08/28/2013 07:34 AM, Peter Maydell wrote:
> > On 28 August 2013 15:31, Richard Henderson <address@hidden> wrote:
> >> On 08/28/2013 01:15 AM, Peter Maydell wrote:
> >>> [*] not impossible, we already do something on the ppc
> >>> that's similar; however I'd really want to take the time to
> >>> figure out how to do endianness swapping "properly"
> >>> and what qemu does currently before messing with it.
> >>
> >> I've got a loose plan in my head for how to clean up handling of
> >> reverse-endian load/store instructions at both the translator and
> >> tcg backend levels.
> > 
> > Nice. Will it allow us to get rid of TARGET_WORDS_BIGENDIAN?
> 
> I don't know, as I don't know off-hand what all that implies.
> 
> Let me lay out my idea and see what you think:
> 
> Currently, at the TCG level we have 8 qemu_ld* opcodes, and 4 qemu_st*
> opcodes, that always produce target_ulong sized results, and always in the
> guest declared endianness.
> 
> There are several problems I want to address:
> 
> (1) I want explicit _i32 and _i64 sizes for the loads and stores.  This will
> clean up a number of places in several translators where we have to load to
> _tl and then truncate or extend to an explicit size.
> 
> (2) I want explicit endianness for the loads and stores.  E.g. when a sparc
> guest does a byte-swapped store, there's little point in doing two offsetting
> bswaps to make that happen.
> 
> (3) For hosts that do not support byte-swapped loads and stores themselves,
> the need to allocate extra registers during the memory operation in order to
> hold the swapped results is an unnecessary burden.  Better to expose the bswap
> operation at the tcg opcode level and let normal register allocation happen.
> 
> Now, naively implementing 1 and 2 would result in 32 opcodes for qemu_ld*.
> That is obviously a non-starter.  However, the very first thing that each tcg
> backend does is map the current 8 opcodes into a bitmask ("opc" and "s_bits"
> in the source).  Let us make that official, and then extend it.
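
To make the quoted bitmask idea concrete, here is a rough sketch of how size,
sign extension and byte swapping could be folded into one operand instead of
one opcode per combination. All names (MEMOP_*, memop_load, load_bytes) and
the bit layout are hypothetical and chosen for illustration only; this is not
the existing "opc"/"s_bits" encoding nor a proposed QEMU API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag layout: log2(size) in the low two bits, then flags. */
enum {
    MEMOP_8     = 0,
    MEMOP_16    = 1,
    MEMOP_32    = 2,
    MEMOP_64    = 3,
    MEMOP_SIZE  = 3,      /* mask for the log2(size) field */
    MEMOP_SIGN  = 1 << 2, /* sign-extend the loaded value */
    MEMOP_BSWAP = 1 << 3  /* access is opposite to the guest default */
};

/* Assemble 'size' bytes from memory in the requested byte order. */
static uint64_t load_bytes(const uint8_t *p, unsigned size, int big_endian)
{
    uint64_t v = 0;
    for (unsigned i = 0; i < size; i++) {
        /* Walk from the most significant byte to the least significant. */
        unsigned idx = big_endian ? i : size - 1 - i;
        v = (v << 8) | p[idx];
    }
    return v;
}

/* One generic load, parameterised by the bitmask rather than by opcode. */
uint64_t memop_load(const uint8_t *p, unsigned memop, int guest_big_endian)
{
    assert(p != NULL);
    unsigned size = 1u << (memop & MEMOP_SIZE);
    /* The swap flag simply inverts the guest's declared endianness. */
    int big = (guest_big_endian != 0) ^ ((memop & MEMOP_BSWAP) != 0);
    uint64_t v = load_bytes(p, size, big);

    if ((memop & MEMOP_SIGN) && size < 8) {
        /* Portable sign extension using unsigned arithmetic only. */
        uint64_t sign = 1ULL << (8 * size - 1);
        v = (v ^ sign) - sign;
    }
    return v;
}
```

With this layout a translator would emit, say, memop_load(p, MEMOP_32 |
MEMOP_BSWAP, 0) for a reverse-endian 32-bit load on a little-endian guest, and
the sixteen size/sign/swap combinations fit in four bits.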


Hi,

I like what you propose as well. A question: some archs have an endian swap
controlled via the MMU, e.g. selectable per page (some PPC, MicroBlaze and
maybe others). AFAIK the behaviour is implementable in QEMU today, but not
very efficiently. Any thoughts/ideas on this?
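
To illustrate what I mean by per-page endianness, here is a minimal sketch in
which the byte order of an access depends on a flag in the page entry. The
names (PageEntry, PAGE_ENDIAN_SWAP, page_load32) and the flat page table are
made up for the example and do not correspond to QEMU's TLB structures. It
assumes 4 KiB pages and aligned 32-bit accesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BITS 12
#define PAGE_SIZE (1u << PAGE_BITS)

/* Hypothetical per-page attribute: accesses use the opposite byte order. */
enum { PAGE_ENDIAN_SWAP = 1 << 0 };

/* Hypothetical page entry: attribute flags plus backing host memory. */
typedef struct {
    uint32_t flags;
    uint8_t *host;
} PageEntry;

/* 32-bit guest load whose byte order is decided by the page attributes. */
uint32_t page_load32(const PageEntry *pages, uint32_t addr,
                     int guest_big_endian)
{
    assert((addr & 3) == 0); /* sketch handles aligned accesses only */

    const PageEntry *pe = &pages[addr >> PAGE_BITS];
    const uint8_t *p = pe->host + (addr & (PAGE_SIZE - 1));

    /* The page flag inverts the guest's default endianness per access. */
    int big = (guest_big_endian != 0) ^ ((pe->flags & PAGE_ENDIAN_SWAP) != 0);

    if (big) {
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    }
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}
```

The point of the sketch is that the byte order is only known after the page
lookup, so it cannot be fixed at translation time the way an explicit
byte-swapped load instruction can.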

Best regards,
Edgar


