qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 2/4] tcg/aarch64: implement new TCG target for a


From: Richard Henderson
Subject: Re: [Qemu-devel] [PATCH 2/4] tcg/aarch64: implement new TCG target for aarch64
Date: Thu, 23 May 2013 09:29:34 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130514 Thunderbird/17.0.6

On 05/23/2013 01:18 AM, Claudio Fontana wrote:
> +static inline void patch_reloc(uint8_t *code_ptr, int type,
> +                               tcg_target_long value, tcg_target_long addend)
> +{
> +    switch (type) {
> +    case R_AARCH64_JUMP26:
> +    case R_AARCH64_CALL26:
> +        reloc_pc26(code_ptr, value);
> +        break;
> +    case R_AARCH64_CONDBR19:
> +        reloc_pc19(code_ptr, value);
> +        break;

The addend operand may always be zero atm, but honor it anyway.

> +static inline void tcg_out_ldst_9(TCGContext *s,
> +                                  enum aarch64_ldst_op_data op_data,
> +                                  enum aarch64_ldst_op_type op_type,
> +                                  int rd, int rn, tcg_target_long offset)

Universally, use TCGReg for arguments that must be registers.

> +static inline void tcg_out_movi32(TCGContext *s, int ext, int rd,
> +                                  uint32_t value)
> +{
> +    uint32_t half, base, movk = 0;
> +    if (!value) {
> +        tcg_out_movr(s, ext, rd, TCG_REG_XZR);
> +        return;
> +    }

No real need to special case zero; it's just an extra test slowing down the
compiler.

> +    /* construct halfwords of the immediate with MOVZ with LSL */
> +    /* using MOVZ 0x52800000 | extended reg.. */
> +    base = ext ? 0xd2800000 : 0x52800000;

Why is ext an argument to movi32?  Don't we know just because of the name that
we only case about 32-bit data?  And thus you should always be writing to the
Wn registers, which automatically zero the high bits of the Xn register?

Although honestly, if you wanted to keep "ext", you could just merge the two
movi routines.  For most tcg targets that's what we already do -- the arm port
from whence you copied this is the outlier, because it wanted to add a
condition argument.

> +/* solve the whole ldst problem */
> +static inline void tcg_out_ldst(TCGContext *s, enum aarch64_ldst_op_data 
> data,
> +                                enum aarch64_ldst_op_type type,
> +                                int rd, int rn, tcg_target_long offset)
> +{
> +    if (offset > -256 && offset < 256) {
> +        tcg_out_ldst_9(s, data, type, rd, rn, offset);

Ouch, that's not much room.  You'll overflow that regularly getting to the
various register slots in env.  You really are going to want to be able to make
use of the scaled 12-bit offset.

That said, this is certainly ok for now.

> +/* mov alias implemented with add immediate, useful to move to/from SP */
> +static inline void tcg_out_movr_sp(TCGContext *s, int ext, int rd, int rn)
> +{
> +    /* using ADD 0x11000000 | (ext) | rn << 5 | rd */
> +    unsigned int base = ext ? 0x91000000 : 0x11000000;
> +    tcg_out32(s, base | rn << 5 | rd);
> +}

Any reason not to just make this the move register function?  That is,
assuming there's a real reason you set up that frame pointer...

It's starting to look like "ext" could be set to 0x80000000 (or really a
symbolic alias of that) and written as x | ext everwhere, instead of
conditionals.  At least in most cases.

> +    /* all arguments passed via registers */
> +    tcg_out_movr(s, 1, TCG_REG_X0, TCG_AREG0);
> +    tcg_out_movr(s, 1, TCG_REG_X1, addr_reg);

addr_reg almost certainly needs to be zero-extended for 32-bit guests, easily
done by setting ext = 0 here.

> +        unsigned int bits; bits = 8 * (1 << s_bits) - 1;

  unsigned int bits = ...

> +#else /* !CONFIG_SOFTMMU */
> +    tcg_abort(); /* TODO */
> +#endif

This really is even easier: zero-extend (if needed), add GUEST_BASE (often held
in a reserved register for 64-bit targets), perform the load/store.  And of
course for aarch64, the add can be done via reg+reg addressing.

See e.g. ppc64 for how to conditionally reserve a register containing 
GUEST_BASE.

> +    case INDEX_op_rotl_i64: ext = 1;
> +    case INDEX_op_rotl_i32:     /* same as rotate right by (32 - m) */
> +        if (const_args[2]) {    /* ROR / EXTR Wd, Wm, Wm, 32 - m */
> +            tcg_out_rotl(s, ext, args[0], args[1], args[2]);
> +        } else {
> +            tcg_out_arith(s, ARITH_SUB, ext, args[2], TCG_REG_XZR, args[2]);
> +            tcg_out_shiftrot_reg(s, SRR_ROR, ext, args[0], args[1], args[2]);

You can't clobber the args[2] register here.  You need to use the TMP register.

And fwiw, you can always use ext = 0 for that negation, since we don't care
about the high bits.

> +    case INDEX_op_setcond_i64: ext = 1;
> +    case INDEX_op_setcond_i32:
> +        tcg_out_cmp(s, ext, args[1], args[2]);
> +        tcg_out_cset(s, ext, args[0], args[3]);
> +        break;

ext = 0 for the cset, since the result is always 0/1.

> +static void tcg_target_qemu_prologue(TCGContext *s)
> +{
> +    /* NB: frame sizes are in 16 byte stack units! */
> +    int frame_size_callee_saved, frame_size_tcg_locals;
> +    int r;
> +
> +    /* save pairs             (FP, LR) and (X19, X20) .. (X27, X28) */
> +    frame_size_callee_saved = (1) + (TCG_REG_X28 - TCG_REG_X19) / 2 + 1;
> +
> +    /* frame size requirement for TCG local variables */
> +    frame_size_tcg_locals = TCG_STATIC_CALL_ARGS_SIZE
> +        + CPU_TEMP_BUF_NLONGS * sizeof(long)
> +        + (TCG_TARGET_STACK_ALIGN - 1);
> +    frame_size_tcg_locals &= ~(TCG_TARGET_STACK_ALIGN - 1);
> +    frame_size_tcg_locals /= TCG_TARGET_STACK_ALIGN;
> +
> +    /* push (FP, LR) and update sp */
> +    tcg_out_push_pair(s, TCG_REG_FP, TCG_REG_LR, frame_size_callee_saved);
> +
> +    /* FP -> callee_saved */
> +    tcg_out_movr_sp(s, 1, TCG_REG_FP, TCG_REG_SP);

You initialize FP, but you don't reserve the register, so it's going to get
clobbered.  We don't actually use the frame pointer in the translated code, so
I don't think there's any call to actually initialize it either.


r~



reply via email to

[Prev in Thread] Current Thread [Next in Thread]