Re: [avr-gcc-list] >4.5.1 better than this at register-structure (xmega)

From: Georg-Johann Lay
Subject: Re: [avr-gcc-list] >4.5.1 better than this at register-structure (xmega) access?
Date: Sat, 13 Oct 2012 00:27:29 +0200
User-agent: Thunderbird (Windows/20100228)

David Brown wrote:
So ideally, the compiler should use direct access when there are two or fewer accesses to the same structure, and Z+q for three or more accesses (in -Os) or stick with direct access (in -O2). If it first uses Z+q, then Z should be loaded with the lowest accessed address to allow Z mode for a slight speed gain. And multiple adjacent accesses can be done using Z+ for faster access.

Pre-/post-modify support is practically non-existent. Oleg wants to rewrite it, but it will be really hard to get the code an assembler programmer would expect. The code is already too messed up by the time the pre-/post-modify pass runs.

That's the theory, anyway, as far as I understand it - but I don't know how it could be implemented in practice...

That's the theory.

In practice, address registers are a very scarce resource on AVR, and loading an address into an address register considerably increases the register pressure on them. Notice that there are only two registers for offsettable addressing, namely Y and Z, and Y might be occupied by the frame pointer.

If the pressure is too high, you get 16-bit moves to the pointer register because the pointer itself lives in, say, R16.

If so, you are slower and the code is at least as big as with direct addressing.

The problem is that these optimizations run before reload, and after reload it is nearly impossible to fix it.

Moreover, if a pointer register is loop-invariant and moved out of a loop, the register pressure of the whole loop goes up.

Using indirect addressing is a good thing for machines that have many address registers. Therefore, indirect accesses are generated with the intention that they are optimized away later if direct addressing is available.

The pass that should do this is one of the propagation passes, but I don't know which one. The optimization itself is simple, and since it is not carried out, it is not unlikely that this is intentional.

typedef struct
{
    unsigned char volatile a;
    unsigned char volatile b;
    unsigned char c;
} S;

void funb (void)
{
  ((S*) 0x0800)->b = 0;
}

void func (void)
{
  ((S*) 0x0800)->c = 0;
}

With this test case you see that func is optimized to direct access by the combiner, which is odd. It's fine that it gets optimized, but I think combine is the wrong place for it (it should have happened earlier).

Writing a combine pattern for this would work, but it is an unwanted kludge that sidesteps fixing the actual problem.

The problem is one or more of:

1) Constant costs are 0 in avr.c, i.e. loading the constant
   address costs nothing.

2) The cost model does not distinguish between register moves and
   memory moves (see -mlog=rtx_costs).

3) Implicit assumption from other architectures that indirect accesses
   are good.

4) Optimizers are repelled by volatile.
