
Re: [avr-gcc-list] [bug] cbi optimization error for 8-bit AVRs


From: Joern Rennecke
Subject: Re: [avr-gcc-list] [bug] cbi optimization error for 8-bit AVRs
Date: Mon, 17 Nov 2014 18:49:21 +0000

On 17 November 2014 17:17, Georg-Johann Lay <address@hidden> wrote:
> as a test case which works both for C and C++.  The C++ front fails in
> applying a similar type demotion like the C front,

Well, duh, that's because it's a different language.

>
>
> Is there a solution that is more reliable than combine?  Currently combine
> appears to be the best candidate.  However there are many constraints that
> must be satisfied for such non-matching split to apply.  Optimization must
> be turned on (it's not only an optimization issue because CBI/SBI are atomic
> whereas IN/OP/OUT sequences are not),

The only way you can get these atomic operations without optimization is
to request them in the first place, e.g. by adding a built-in function
for cbi and using it.
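
To make "request them in the first place" concrete, here is a minimal sketch.
No cbi built-in is assumed to exist; inline asm is swapped in as one way to
ask for the instruction explicitly. The function name clear_portb_bit3, the
choice of PORTB bit 3, and the host-side PORTB_sim fallback are illustrative
assumptions, and the asm branch assumes a classic device where PORTB lies in
the low I/O space that CBI can address.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>

/* Explicitly request a CBI: this does not depend on the optimization level.
   Assumes PORTB is in the I/O range CBI can reach (illustrative choice). */
static inline void clear_portb_bit3(void)
{
    __asm__ __volatile__ ("cbi %0, %1"
                          : /* no outputs */
                          : "I" (_SFR_IO_ADDR(PORTB)), "I" (3));
}
#else
/* Host fallback so the sketch can be compiled anywhere: a simulated
   register and a plain read-modify-write, which is NOT atomic. */
static volatile uint8_t PORTB_sim = 0xFF;

static inline void clear_portb_bit3(void)
{
    PORTB_sim &= (uint8_t)~(1u << 3);
}
#endif
```

With the plain C form, whether a single CBI or an IN/ANDI/OUT sequence comes
out is up to the optimizers; with the explicit request it is not.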


> The combiner must not come up with a CBI-like pattern for this; it would
> mean combine drags a volatile access over another, and that would be
> incorrect for the same reasons that PR51374 was a combine bug.

The fact that combine.c uses init_recog_no_volatile clearly shows that
combine was not designed with volatile operations in mind.
If combine were fixed up and audited to be volatile-safe, it should be
able to use init_recog instead.
The avr port recognizing volatile accesses irrespective of volatile_ok is
playing with fire.

>> Another possible approach would be to use a peephole2 pattern or a
>> target-specific
>
> peephole2 is even less reliable than combine here :-( because it's rather

peephole2 is more specific, thus it can be used more safely, i.e. it is
easier to avoid unwanted transformations.  But by the same token, the
desired transformation is only performed when the instructions are laid
out in an anticipated pattern.

> about data flow than about code flow.  A single (move) insn that has nothing
> to do with the operation and peep2 fails...

You can make peephole2 patterns that anticipate a fixed number and placement of
unrelated instructions.


> This would work for all 8-bit modes, something like
>
> (define_code_iterator some_binop [xor ior and plus minus mult ashift])
>
> (define_split
>   [(set (mem:ALL1 (match_operand:ALL1 0 "any_qi_mem_operand" ""))
>         (some_binop:ALL1 (match_operand:ALL1 1 "nonmemory_operand" "")
>                          (match_operand:ALL1 2 "nonmemory_operand" "")))
>    (clobber (match_operand:ALL1 3 "register_operand"))]
>   ""
>   [(set (match_dup 3)
>         (some_binop:ALL1 (match_dup 1)
>                          (match_dup 2)))
>    (set (match_dup 0)
>         (match_dup 3))]
>
> and then something like 1 == GET_MODE_SIZE (mode) in the predicate.

We started with the premise that the patterns we want to optimize come from
C integer promotion rules.  So we have to watch out for integer modes.
Does the optimization problem also arise frequently enough with non-integer
modes that we'd care to make our code more complicated for that?
Oh, and as mentioned above, we're playing with fire, now you want a bigger
conflagration...
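
For reference, the promotion premise can be seen in a few lines of plain C.
This is only a sketch; the helper names promoted_size and clear_mask are made
up for illustration. The 8-bit operands are promoted to int before the
operation, and the result is truncated back on assignment, which is the wide
pattern the optimizers then have to narrow again.

```c
#include <stdint.h>

/* Size of the type in which `reg & ~mask` is evaluated.
   sizeof does not evaluate its operand; it only inspects the type. */
static unsigned promoted_size(uint8_t reg, uint8_t mask)
{
    return (unsigned) sizeof (reg & ~mask);
}

/* Both operands are promoted to int, so ~mask is a negative int with all
   upper bits set; the cast truncates the result back to 8 bits. */
static uint8_t clear_mask(uint8_t reg, uint8_t mask)
{
    return (uint8_t) (reg & ~mask);
}
```

promoted_size reports sizeof (int), not 1, on any conforming implementation
where int is wider than 8 bits.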

> Unfortunately there is no easy going way to query GCC for the current pass,
> e.g. something like combine_in_progress or combine_completed...

How about (strcmp (current_pass->name, "combine") == 0) ?

If you want to know if the combine pass has run, you can insert
a new pass after it that sets a flag.
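
Both ideas, sketched as stand-alone C so they can be compiled outside GCC.
The struct opt_pass_sketch, current_pass_sketch, combine_completed and
after_combine_pass_execute names are stand-ins invented for this sketch; only
the name comparison itself is taken from the suggestion above. A real
implementation would also have to reset the flag for each function being
compiled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for GCC's pass descriptor; only the field used here. */
struct opt_pass_sketch { const char *name; };

/* Stand-in for GCC's global current_pass; NULL when no pass is running. */
static struct opt_pass_sketch *current_pass_sketch = NULL;

/* True while the pass named "combine" is the current one. */
static bool combine_in_progress(void)
{
    return current_pass_sketch != NULL
           && strcmp(current_pass_sketch->name, "combine") == 0;
}

/* Flag set by a hypothetical extra pass scheduled right after combine. */
static bool combine_completed = false;

/* Body of that hypothetical pass: it does nothing but record that
   combine has already run. */
static void after_combine_pass_execute(void)
{
    combine_completed = true;
}
```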


