[PATCH 0/2] arm: Implement M-profile trapping on division by zero

From: Peter Maydell
Subject: [PATCH 0/2] arm: Implement M-profile trapping on division by zero
Date: Fri, 30 Jul 2021 16:16:34 +0100

Unlike A-profile, for M-profile the UDIV and SDIV insns can be
configured to raise an exception on division by zero, using the CCR
DIV_0_TRP bit.  This patchset implements that missing functionality
by having the udiv and sdiv helpers raise an exception if needed.
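For illustration only, the trapping behaviour described above can be modelled in plain C roughly as follows. This is a sketch, not the code in the patches: ModelCPU, its fields and udiv_trap_model() are hypothetical names, and a real helper would raise the exception and not return, whereas this model just records that a fault would have been taken.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for CPU state; not a QEMU structure. */
typedef struct {
    bool div_0_trp;      /* models the CCR DIV_0_TRP bit */
    bool fault_pending;  /* set where a real helper would raise an exception */
} ModelCPU;

/* Model of an unsigned divide that optionally traps on a zero divisor. */
static uint32_t udiv_trap_model(ModelCPU *cpu, uint32_t num, uint32_t den)
{
    if (den == 0) {
        if (cpu->div_0_trp) {
            /* A real helper would not return from this point. */
            cpu->fault_pending = true;
        }
        /* Non-trapping division by zero yields 0. */
        return 0;
    }
    return num / den;
}
```

With div_0_trp clear, a zero divisor simply produces 0; with it set, the model additionally flags the fault.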

Some questions:

Is it worth allowing A-profile to retain the mildly better codegen it
gets from not having to pass in 'env' and marking the helper as
no-side-effects (i.e. having M-specific udiv/sdiv helpers)?

Is it worth inlining either udiv or sdiv for the A-profile case?
udiv can be done with movcond/movcond/divu, something like:

    /* t1 = (t2 == 0) ? 0 : t1;    t2 = (t2 == 0) ? 1 : t2 */
    tcg_gen_movcond_i32(TCG_COND_EQ, t1, t2, tcg_constant_i32(0),
                        tcg_constant_i32(0), t1);
    tcg_gen_movcond_i32(TCG_COND_EQ, t2, t2, tcg_constant_i32(0),
                        tcg_constant_i32(1), t2);
    /* Either t1 / t2; or 0 / 1 to give 0 for division-by-zero */
    tcg_gen_divu_i32(t1, t1, t2);
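
To sanity-check the logic of that sequence outside TCG, here is a plain C model of the two conditional moves followed by the divide. It is only a sketch: udiv_model() is a hypothetical name, and the ternaries stand in for the movcond ops.

```c
#include <stdint.h>

/* Plain C model of the movcond/movcond/divu sequence (not QEMU code). */
static uint32_t udiv_model(uint32_t t1, uint32_t t2)
{
    /*
     * Both selects test the divisor. The numerator must be fixed up
     * first, because the second select overwrites a zero divisor.
     */
    t1 = (t2 == 0) ? 0 : t1;
    t2 = (t2 == 0) ? 1 : t2;
    /* Either t1 / t2, or 0 / 1 for the division-by-zero case. */
    return t1 / t2;
}
```

The divide never sees a zero divisor, so the host division instruction cannot fault, and x/0 comes out as 0.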

sdiv is more painful because it needs to check for both the x/0 and
INT_MIN/-1 cases.  Some other targets choose to generate inline TCG
ops for it, though.
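
For reference, the two special cases an inline sdiv would have to cover look like this in plain C. Again this is a sketch with a hypothetical name (sdiv_model), assuming the non-trapping behaviour where x/0 gives 0 and INT_MIN/-1 gives INT_MIN.

```c
#include <stdint.h>

/* Plain C model of non-trapping signed division (not QEMU code). */
static int32_t sdiv_model(int32_t num, int32_t den)
{
    if (den == 0) {
        /* Division by zero yields 0 when not trapping. */
        return 0;
    }
    if (num == INT32_MIN && den == -1) {
        /* The true quotient is unrepresentable; the result is INT_MIN. */
        return INT32_MIN;
    }
    return num / den;
}
```

Both checks are needed before the host divide: in C the INT_MIN/-1 case is undefined behaviour, and on x86 it faults just like a zero divisor.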

Side note: I don't understand the x86-64 codegen for the above
sketch of an inline udiv. When I try it, the TCG ops are

  mov_i32 tmp3,r2
  mov_i32 tmp6,r3
  movcond_i32 tmp3,tmp6,$0x0,$0x0,tmp3,eq
  movcond_i32 tmp6,tmp6,$0x0,$0x1,tmp6,eq
  mov_i32 tmp7,$0x0
  divu2_i32 tmp3,tmp7,tmp3,tmp7,tmp6
  mov_i32 r3,tmp3

but the x86 code is

0x7f5f1807dc0c:  45 33 f6                 xorl     %r14d, %r14d
0x7f5f1807dc0f:  45 85 ed                 testl    %r13d, %r13d
0x7f5f1807dc12:  45 0f 44 e6              cmovel   %r14d, %r12d
0x7f5f1807dc16:  41 bf 01 00 00 00        movl     $1, %r15d
0x7f5f1807dc1c:  45 3b ee                 cmpl     %r14d, %r13d
0x7f5f1807dc1f:  45 0f 44 ef              cmovel   %r15d, %r13d
0x7f5f1807dc23:  41 8b c4                 movl     %r12d, %eax
0x7f5f1807dc26:  41 8b d6                 movl     %r14d, %edx
0x7f5f1807dc29:  41 f7 f5                 divl     %r13d

where the comparison for the first cmovel is 'testl %r13d, %r13d',
but the second comparison is 'cmpl %r14d, %r13d'.  That's the same
effect (given r14 is 0) but I don't understand why the backend has
chosen to generate different code for the two cases.  (Ideally of
course it would notice that it already had generated the condition
check and not repeat it.)

-- PMM

Peter Maydell (2):
  target/arm: Re-indent sdiv and udiv helpers
  target/arm: Implement M-profile trapping on division by zero

 target/arm/cpu.h       |  1 +
 target/arm/helper.h    |  4 ++--
 target/arm/helper.c    | 34 ++++++++++++++++++++++++++--------
 target/arm/m_helper.c  |  4 ++++
 target/arm/translate.c |  4 ++--
 5 files changed, 35 insertions(+), 12 deletions(-)

