Bison lexer
From: Hans Åberg
Subject: Bison lexer
Date: Wed, 29 Aug 2018 15:56:38 +0200
> On 29 Aug 2018, at 00:31, Frank Heckenbach <address@hidden> wrote:
>
> Hans Åberg wrote:
>
>>> On 27 Aug 2018, at 22:10, Akim Demaille <address@hidden> wrote:
>>>
>>>> Most of my porting work, apart from writing the new skeletons, was
>>>> general grammar cleanup and conversion of semantic types from raw
>>>> pointers and containers to smart pointers and other RAII classes
>>>> (which was my main goal of the port, of course), and changes in the
>>>> lexer (dropping flex, but that's another story).
>>>
>>> I fought a lot with Flex, but it works ok in C++ too with lalr1.cc.
>>> I have one parser here,
>>> https://gitlab.lrde.epita.fr/vcsn/vcsn/tree/master/lib/vcsn/dot,
>>> and another there
>>> https://gitlab.lrde.epita.fr/vcsn/vcsn/tree/master/lib/vcsn/rat
>>> for instance, using Flex.
>>
>> That is probably versions before 2.6; the yyin and yyout have been
>> changed in the C++ header so that they are no longer pointers, so
>> it is not only incompatible with the header of older versions, but
>> also with the code it writes, resulting in the issue [1].
>>
>> 1.
>> https://stackoverflow.com/questions/34438023/openfoam-flex-yyin-rdbufstdcin-rdbuf-error
>
> Though this wasn't actually my problem, I'll reply to this mail
> rather than the main thread to keep it separate from the actual
> Bison discussion.
One can change the subject. :-)
> For a start, I didn't have very good experience communicating with
> Flex maintainer(s?) who seemed rather nonchalant WRT gcc warnings
> etc. in the generated code, so over the years I'd been adjusting
> various warning-suppression gcc options or doing dirty #define
> tricks to avoid warnings, or sometimes even post-processing the
> generated lexer with sed.
GCC 8.2 uses C17 as default.
> But the final straw was when, after changing to C++ Bison, I wanted
> to switch to C++ Flex too and found this beautiful comment:
>
> /* The c++ scanner is a mess. The FlexLexer.h header file relies on the
> * following macro. This is required in order to pass the c++-multiple-scanners
> * test in the regression suite. We get reports that it breaks inheritance.
> * We will address this in a future release of flex, or omit the C++ scanner
> * altogether. */
It has been like that since the 1990s, I believe.
> I know there are no guarantees in the future of free software
> (neither of non-free software, of course), but such an
> announcement/threat seemed too risky to me.
Indeed, it seems broken now.
> Meanwhile I'd often thought that all Flex actually does is matching
> alternative regular expressions. Plain RE can do that as well, and
> by capturing subexpressions I can find out which alternative was
> matched.
>
> Of course, it would be (and indeed turned out to be) somewhat slower (RE
> built at runtime vs. compile time), but like parsing, lexing speed
> is not a big issue to me. So I was ready to trade that in for
> convenience of programming and one less dependence on a problematic
> tool.
>
> (Side note: Many years ago, on a different project, I dropped gperf
> to recognize predefined identifiers for similar reasons, and put
> them in a look-up table instead. Except for a tiny slowdown, that
> had worked out well, so I was confident I could drop Flex, too. --
> Now apparently the next one in line after dropping gperf and Flex
> should be Bison, but don't worry, I don't see an easy way to replace
> it, since Bison actually does some nontrivial stuff. :)
>
> So I wrote a small library that builds that massive RE out of single
> rules and maps subexpressions back to rules (even in the case that
> rules contain subexpressions of their own), and that works for me.
I did that, too: I wrote some DFA/NFA code, and incidentally found that the most
efficient method to make action matches is via a reverse NFA lookup, cf. [1-3]. Also,
I have made UTF-8/32 to octet character class translations.
1. https://gcc.gnu.org/ml/libstdc++/2018-04/msg00032.html
2. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85472
3. https://gcc.gnu.org/ml/libstdc++/2018-05/msg00015.html
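To illustrate the combined-RE approach discussed above (one big alternation with one capture group per rule, mapped back to rule numbers), here is a minimal C++ sketch using std::regex. It is not Frank's library, which is not shown in this thread; the names RuleLexer, add_rule, next and tokenize are hypothetical. It assumes rules contain no backreferences, and note that ECMAScript alternation picks the first alternative that matches, not the longest one as Flex does, so rule order matters.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: combines single rules into one regular expression
// and maps the matching capture group back to the rule that produced it.
class RuleLexer {
public:
    struct Token {
        std::size_t rule;  // index of the rule that matched
        std::string text;  // matched text
    };

    // Adds a rule (ECMAScript syntax) and returns its index.
    std::size_t add_rule(const std::string& pattern) {
        patterns_.push_back(pattern);
        compiled_ = false;
        return patterns_.size() - 1;
    }

    // Tries to match one token exactly at `first`; returns false if no rule matches.
    bool next(std::string::const_iterator first,
              std::string::const_iterator last, Token& out) {
        if (!compiled_) compile();
        std::smatch m;
        if (!std::regex_search(first, last, m, combined_,
                               std::regex_constants::match_continuous))
            return false;
        // Exactly one top-level alternative took part in the match.
        for (std::size_t r = 0; r < group_.size(); ++r) {
            if (m[group_[r]].matched) {
                out.rule = r;
                out.text = m[group_[r]].str();
                return true;
            }
        }
        return false;
    }

private:
    void compile() {
        std::string all;
        group_.clear();
        std::size_t next_group = 1;  // group 0 is the whole match
        for (std::size_t r = 0; r < patterns_.size(); ++r) {
            if (r != 0) all += '|';
            all += '(' + patterns_[r] + ')';
            group_.push_back(next_group);
            // Skip the wrapper group plus any groups inside the rule itself.
            next_group += 1 + std::regex(patterns_[r]).mark_count();
        }
        combined_ = std::regex(all);
        compiled_ = true;
    }

    std::vector<std::string> patterns_;
    std::vector<std::size_t> group_;  // rule index -> capture group index
    std::regex combined_;
    bool compiled_ = false;
};

// Splits `input` into tokens; stops at the first position where no rule
// matches (or where a rule matches the empty string).
std::vector<RuleLexer::Token> tokenize(RuleLexer& lx, const std::string& input) {
    std::vector<RuleLexer::Token> tokens;
    auto pos = input.cbegin();
    RuleLexer::Token t;
    while (pos != input.cend() && lx.next(pos, input.cend(), t) &&
           !t.text.empty()) {
        tokens.push_back(t);
        pos += static_cast<std::ptrdiff_t>(t.text.size());
    }
    return tokens;
}
```

The rule index returned by add_rule can then be used to dispatch to an action, e.g. with a switch or a table of callbacks. Since the RE is built at run time, this trades some speed for not depending on a generator tool, as described above.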