Re: Which lexer do people use?
From: Adrian Vogelsgesang
Subject: Re: Which lexer do people use?
Date: Sat, 4 Jul 2020 19:30:36 +0000
User-agent: Microsoft-MacOutlook/10.10.17.200615
Hi Daniele,
> Which other scanners do people use?
For what it’s worth, we are using a hand-rolled scanner. It seemed the fastest
way to get going and the easiest to maintain.
It also allowed us to embed a few hacks directly inside the scanner. E.g., in a
few places our grammar is not actually LR(1). This affects only very few edge
cases, though, so we don’t want to use GLR. Instead, our scanner does the
lookahead itself: e.g., upon encountering the token “WITH”, it looks at the
following token. If the next token is “TIMESTAMP”, it produces “WITH_LA” instead
of just “WITH”. That gives us 1 token of lookahead from the scanner. Combined
with the 1 token of lookahead provided by bison, we can now parse our LR(2)
grammar.
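To illustrate the idea, here is a minimal sketch in C of such a one-token
lookahead filter. It is not our actual scanner: the token codes, the
lookahead_lexer struct and the array_source stand-in for the raw scanner are all
hypothetical names made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token codes; a real project would take these from the
   header that bison generates. */
enum token { TOK_EOF = 0, TOK_WITH, TOK_WITH_LA, TOK_TIMESTAMP, TOK_IDENT };

/* Signature of the underlying raw scanner (hand-rolled or generated). */
typedef enum token (*raw_lexer_fn)(void *state);

/* Filter state: the raw scanner plus a one-token pushback buffer. */
struct lookahead_lexer {
    raw_lexer_fn raw;
    void *raw_state;
    int have_pending;      /* non-zero if 'pending' holds a token */
    enum token pending;    /* token already read while peeking */
};

void lookahead_lexer_init(struct lookahead_lexer *lx,
                          raw_lexer_fn raw, void *raw_state)
{
    lx->raw = raw;
    lx->raw_state = raw_state;
    lx->have_pending = 0;
    lx->pending = TOK_EOF;
}

/* Returns the next token for the parser. WITH becomes WITH_LA when the
   token after it is TIMESTAMP; every other token passes through as is. */
enum token lookahead_lexer_next(struct lookahead_lexer *lx)
{
    enum token cur;

    assert(lx != NULL && lx->raw != NULL);

    /* Hand out the buffered token first, if a previous call peeked. */
    if (lx->have_pending) {
        cur = lx->pending;
        lx->have_pending = 0;
    } else {
        cur = lx->raw(lx->raw_state);
    }

    if (cur != TOK_WITH)
        return cur;

    /* Peek at the following token and keep it for the next call. */
    lx->pending = lx->raw(lx->raw_state);
    lx->have_pending = 1;

    return lx->pending == TOK_TIMESTAMP ? TOK_WITH_LA : TOK_WITH;
}

/* Toy raw scanner for demonstration: replays a fixed array of tokens. */
struct array_source {
    const enum token *toks;
    size_t len;
    size_t pos;
};

enum token array_source_next(void *state)
{
    struct array_source *src = state;
    return src->pos < src->len ? src->toks[src->pos++] : TOK_EOF;
}
```

The sketch only passes token codes around. With a real bison parser, the
semantic value and location of the peeked token would have to be buffered along
with it, since reading the next token overwrites them.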
Not sure if this would also have been possible with flex, but given that we
have a hand-rolled scanner, it was straightforward.
You can find a similar hack in
https://github.com/postgres/postgres/blob/master/src/backend/parser/gram.y#L721,
if you look for the WITH_LA keyword. Postgres uses a flex scanner and then
stacks a custom layer between flex and bison, which introduces additional
maintenance overhead.
Cheers,
Adrian
From: help-bison <help-bison-bounces+avogelsgesang=tableau.com@gnu.org> on
behalf of Daniele Nicolodi <daniele@grinta.net>
Date: Friday, 3 July 2020 at 23:15
To: Bison Help <help-bison@gnu.org>
Subject: Which lexer do people use?
Hello,
the historical pairing is using Flex with Bison. However, while Bison is
under active development and seems to be a very solid code base, there
isn't much activity on the Flex side
https://github.com/westes/flex and the
Flex codebase and capabilities show their age.
I recently became aware of RE/flex
https://www.genivia.com/reflex.html
which seems very promising. However, it only generates a C++ scanner
which may be possible (I haven't tried) to retro-fit into existing C projects
to, for example, gain full Unicode (in its UTF-8 encoded form) support.
Has anyone tried to hammer a C++ scanner peg generated by RE/flex into a
C grammar hole generated by Bison?
Which other scanners do people use?
Thank you.
Cheers,
Dan