bug-gawk

Re: Clang-built Gawk 5.2.1 regex oddity


From: Paul Eggert
Subject: Re: Clang-built Gawk 5.2.1 regex oddity
Date: Sun, 1 Jan 2023 22:10:28 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2

This is a serious bug in Clang: it generates incorrect machine code.

The code that Clang generates for the following (gawk/support/dfa.c lines 1141-1143):

    ((dfa->syntax.dfaopts & DFA_CONFUSING_BRACKETS_ERROR
      ? dfaerror : dfawarn)
     (_("character class syntax is [[:space:]], not [:space:]")));

is immediately followed by the code generated for the following (gawk/support/dfa.c line 1015):

                    dfaerror (_("invalid character class"));

and this is incorrect because the two source code regions are not connected with each other.

You can see the bug in the attached (compressed) file dfa.s which contains the assembly language output. Here's the dfa.s file starting with line 6975:

  6975          testb   $4, 456(%r12)
  6976          movl    $dfawarn, %eax
  6977          movl    $dfaerror, %ebx
  6978          cmoveq  %rax, %rbx
  6979          movl    $.L.str.26, %esi
  6980          xorl    %edi, %edi
  6981          movl    $5, %edx
  6982          callq   dcgettext
  6983          movq    %rax, %rdi
  6984          callq   *%rbx
  6985  .LBB34_144:
  6986          movl    $.L.str.25, %esi
  6987          xorl    %edi, %edi
  6988          movl    $5, %edx
  6989          callq   dcgettext
  6990          movq    %rax, %rdi
  6991          callq   dfaerror

Line 6984 is the call, from source lines 1141-1143, to either dfaerror or dfawarn, and it is immediately followed by the code for source line 1015. This means that at runtime, when dfawarn returns, the code immediately calls dfaerror, which is incorrect.

My guess is that Clang got confused because dfaerror is declared _Noreturn, so Clang mistakenly assumed that dfawarn is also _Noreturn, which it is not.
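To make the suspected pattern concrete, here is a minimal sketch of the same shape of code: a conditional expression that selects between a _Noreturn function and an ordinary one, followed by code that must still run when the ordinary one returns. The names report_error, report_warning, check_brackets, warnings_seen and demo_warning_path are hypothetical stand-ins, not taken from dfa.c, and this sketch is not claimed to reproduce the miscompilation by itself.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical counter so a caller can observe that the warning path ran. */
int warnings_seen = 0;

/* Stand-in for dfaerror: declared _Noreturn, never returns. */
_Noreturn void report_error(const char *msg)
{
    fprintf(stderr, "error: %s\n", msg);
    exit(EXIT_FAILURE);
}

/* Stand-in for dfawarn: NOT _Noreturn, returns normally. */
void report_warning(const char *msg)
{
    fprintf(stderr, "warning: %s\n", msg);
    warnings_seen++;
}

/* Same shape as dfa.c lines 1141-1143: pick the callee with ?: and
   call it.  A correct compiler must assume the call can return,
   because one of the two possible callees does.  */
int check_brackets(int treat_as_error)
{
    (treat_as_error ? report_error : report_warning)
      ("character class syntax is [[:space:]], not [:space:]");
    return 1;   /* must be reached on the warning path */
}

/* Usage sketch: the warning path has to come back to the caller. */
void demo_warning_path(void)
{
    int before = warnings_seen;
    int continued = check_brackets(0);
    assert(continued == 1);
    assert(warnings_seen == before + 1);
}
```

If a compiler wrongly treated the indirect call as non-returning, the code after the call could be dropped or replaced by whatever happens to follow, which matches the fall-through seen in dfa.s.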

I worked around the Clang bug by installing the attached patch into Gnulib. Please give it a try with Gawk.

Incorrect code generation is a serious bug in Clang; can you please report it to the Clang folks? I am considering using a bigger hammer, and doing this:

   #define _Noreturn /*empty*/

whenever Clang is used, until the bug is fixed.
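As a rough sketch of what that bigger hammer might look like, the define could be guarded by Clang's predefined __clang__ macro. This is only an illustration under stated assumptions: the guard is placed after all system headers, NORETURN_WORKAROUND_ACTIVE, fatal and noreturn_workaround_active are hypothetical names, and no Clang version check is attempted.

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumption: every system header has already been included, since a
   keyword should not be redefined before standard headers are read.  */
#ifdef __clang__
# define _Noreturn /*empty*/
# define NORETURN_WORKAROUND_ACTIVE 1   /* hypothetical marker macro */
#else
# define NORETURN_WORKAROUND_ACTIVE 0
#endif

/* Hypothetical fatal-error helper.  Under Clang the specifier now
   expands to nothing, so the compiler no longer sees this function
   as non-returning; other compilers still see _Noreturn.  */
_Noreturn void fatal(const char *msg)
{
    fprintf(stderr, "fatal: %s\n", msg);
    exit(EXIT_FAILURE);
}

/* Hypothetical helper reporting whether the workaround is in effect. */
int noreturn_workaround_active(void)
{
    return NORETURN_WORKAROUND_ACTIVE;
}
```

The cost of this approach is that Clang loses the optimization and diagnostic benefits of _Noreturn everywhere, which is why it is the bigger hammer rather than the first choice.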

This is because, if the bug occurs here, similar bugs are likely to occur elsewhere, and this sort of thing can be really subtle and hard to catch or work around in general. Clang really needs to get this fixed.

Thanks.

Attachment: dfa.s.gz
Description: application/gzip

Attachment: 0001-dfa-work-around-Clang-15-bug.patch
Description: Text Data

