Re: [avr-gcc-list] >4.5.1 better than this at register-structure (xmega)


From:	Georg-Johann Lay
Subject:	Re: [avr-gcc-list] >4.5.1 better than this at register-structure (xmega) access?
Date:	Sat, 13 Oct 2012 00:27:29 +0200
User-agent:	Thunderbird 2.0.0.24 (Windows/20100228)

David Brown wrote:

So ideally, the compiler should use direct access when there are two or fewer accesses to the same structure, and Z+q for three or more accesses (in -Os), or stick with direct access (in -O2). If it first uses Z+q, then Z should be loaded with the lowest accessed address to allow Z mode for a slight speed gain. And multiple adjacent accesses can be done using Z+ for faster access.
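The break-even point in that rule can be checked with a little arithmetic on code size. The following is only a sketch: it assumes the usual xmega instruction sizes (STS is two 16-bit words, STD Z+q and LDI are one word each) and that Z is loaded with two LDI instructions. The function names are hypothetical and not part of avr-gcc.

```c
#include <stdbool.h>

/* Assumed instruction sizes in 16-bit words (classic/xmega AVR cores). */
enum { STS_WORDS = 2, STD_WORDS = 1, LDI_WORDS = 1 };

/* n stores using direct addressing: one STS per access. */
static unsigned direct_words(unsigned n)
{
    return n * STS_WORDS;
}

/* n stores via Z+q: two LDIs to set up Z, then one STD per access. */
static unsigned indirect_words(unsigned n)
{
    return 2 * LDI_WORDS + n * STD_WORDS;
}

/* Hypothetical decision rule: when optimizing for size, go indirect
   only if it is strictly smaller; otherwise stay with direct access. */
static bool use_indirect(unsigned n, bool optimize_for_size)
{
    if (!optimize_for_size)
        return false;
    return indirect_words(n) < direct_words(n);
}
```

Under these assumptions two accesses tie at four words each, and Z+q only wins from three accesses on, which is where the rule above draws the line.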

Pre-/post-modify support is effectively non-existent. Oleg wants to rewrite it, but it will be really hard to get the code an assembler programmer expects. The code is too messed up by the time the pre-/post-modify pass runs.

That's the theory, anyway, as far as I understand it - but I don't know how it could be implemented in practice...


That's the theory.

In practice, address registers are a very scarce resource on AVR, and loading an address into an address register considerably increases register pressure on them. Notice that there are only two regs for offsettable addressing, namely Y and Z, and Y might be occupied by the frame pointer.

If pressure is too high you get 16-bit moves to the pointer reg because the pointer itself lives in, say, R16.

If so, you are slower and the code is at least as big as with direct addressing.

Problem is that these optimizations run before reload, and after reload it is nearly impossible to fix it.

Moreover, if a pointer reg is a loop invariant and moved outside a loop, the register pressure of the whole loop goes up.

Using indirect addressing is a good thing for machines that have many address registers. Therefore, indirect accesses are generated with the intention that they are optimized away later if direct addressing is available.

The pass to do this is one of the propagation passes, but dunno which of them. The optimization itself is simple, and as it is not carried out, it's not unlikely that this is intentional.


typedef struct
{
    unsigned char volatile a;
    unsigned char volatile b;
    unsigned char c;
} S;

void funb (void)
{
  ((S*) 0x0800)->b = 0;
}

void func (void)
{
  ((S*) 0x0800)->c = 0;
}

With this test case you see that func is optimized to direct access by the combiner, which is odd. It's fine that it optimizes it, but methinks it is the wrong place for it (should have happened earlier).

Writing a combine pattern for this would work but is a kludge that is not wanted and bypasses fixing the actual problem.
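For reference, the absolute addresses that a direct store in funb and func would target follow from the struct layout. This host-side sketch only does the offset arithmetic and touches no hardware address; S_BASE and the helper names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Same layout as the test case: three single-byte members, no padding. */
typedef struct
{
    unsigned char volatile a;
    unsigned char volatile b;
    unsigned char c;
} S;

/* Hypothetical name for the fixed base address used in the test case. */
#define S_BASE 0x0800u

/* Absolute address a direct store to ->b would use. */
static uint16_t addr_of_b(void)
{
    return (uint16_t) (S_BASE + offsetof (S, b));
}

/* Absolute address a direct store to ->c would use. */
static uint16_t addr_of_c(void)
{
    return (uint16_t) (S_BASE + offsetof (S, c));
}
```

So both stores have a compile-time constant target (0x0801 and 0x0802), and the only difference between the two functions is the volatile qualifier on the member.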


Problem is one or more of:

1) Constant costs are 0 in avr.c i.e. loading the constant
   address costs nothing.

2) The cost model does not distinguish between register moves and
   memory moves (see -mlog=rtx_costs).

3) Implicit assumption from other architectures that indirect accesses
   are good.

4) Optimizers are repelled by volatile.
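To illustrate why point 1 matters, here is a toy model. It is not GCC's rtx cost code; the names are hypothetical, and the costs are simply code size in 16-bit words under the assumption that a direct store takes two words, an indirect store one word, and loading the address into a pointer register really takes two words.

```c
#include <stdbool.h>

/* Toy model only: cost of n indirect stores as a cost model would see
   it, given what it charges for loading the constant address. */
static unsigned modeled_indirect(unsigned n, unsigned const_cost)
{
    return const_cost + n * 1u;   /* one word per indirect store */
}

/* Cost of n direct stores: two words each, no pointer setup. */
static unsigned modeled_direct(unsigned n)
{
    return n * 2u;
}

/* Does the indirect form look strictly cheaper under this model? */
static bool looks_cheaper_indirect(unsigned n, unsigned const_cost)
{
    return modeled_indirect(n, const_cost) < modeled_direct(n);
}
```

With const_cost set to 0 the indirect form looks cheaper even for a single access, although it would really need three words against two. With const_cost set to 2 it only looks cheaper from three accesses on.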
