Re: Interactive parsing with Bison

From: Satya
Subject: Re: Interactive parsing with Bison
Date: Mon, 26 Jun 2006 00:37:47 -0500

On 6/25/06, Richard Stallman <address@hidden> wrote:
    First, the debugger IDE must be able to display multiple arrows, as
    though there were multiple program counters in each function call.
    The reason for this, is because parsing proceeds over sets of rules,
    with a parsing position marked in each of them. (This is the stuff
    displayed in the Bison generated .output file of each state.)

We could surely modify GDB and gdb-ui.el to do that.
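The "multiple arrows" in the quoted text correspond to what parsing textbooks call an item set: a group of rules, each with a dot marking how far the parser has got. Here is a minimal sketch of how such a set is expanded (the closure step). The toy grammar and all the names in it (Item, GRAMMAR, closure) are made up for illustration and are not taken from Bison's sources.

```python
from typing import NamedTuple, Tuple


class Item(NamedTuple):
    """A rule with a parsing position: lhs -> rhs[:dot] . rhs[dot:]"""
    lhs: str
    rhs: Tuple[str, ...]
    dot: int

    def __str__(self) -> str:
        symbols = list(self.rhs)
        symbols.insert(self.dot, ".")  # show the "arrow" position
        return f"{self.lhs} -> {' '.join(symbols)}"


# Hypothetical toy grammar: nonterminal -> list of right-hand sides.
GRAMMAR = {
    "X": [("alpha", "beta", "zeta")],
    "beta": [("theta", "gamma")],
}


def closure(kernel):
    """Add an item 'N -> . rhs' for every nonterminal N right after a dot."""
    result = list(kernel)
    seen = set(result)
    work = list(result)
    while work:
        item = work.pop()
        if item.dot < len(item.rhs):
            next_symbol = item.rhs[item.dot]
            for rhs in GRAMMAR.get(next_symbol, []):
                new_item = Item(next_symbol, rhs, 0)
                if new_item not in seen:
                    seen.add(new_item)
                    result.append(new_item)
                    work.append(new_item)
    return result


state = closure([Item("X", ("alpha", "beta", "zeta"), 1)])
for item in state:
    print(item)
```

Each printed line is one "program counter" that a debugger front end would have to display for the current state.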

My concern is this - are users of bison ready to wade through
file (or in case of bison the ?

The file is full of macros and unavoidable clutter; anyone who is
willing to understand this file can happily squeeze it through gdb
and debug their grammars just with the additional help of the .output
file.
The commercial tools available today (there aren't many, by the way)
take an alternative route: they let you compose your grammar in a
nice IDE, point out all the conflicts, and simulate a parse on a
test input file even before the final parser code is generated.

I can see that users of such tools are more interested in getting
their language parsed correctly than in running the generated
parser through a debugger. In other words, users typically are not
interested in the implementation mechanics of the parser.

(By implementation mechanics, I am referring to such nitty-gritty details
as, for example, the array entry yyprhs[i] holding the index in yyrhs of
the first RHS symbol of rule i. No, I don't think users will take on all
that hardship to decode this just so that they can debug;
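To show what decoding such tables involves, here is a small sketch. It assumes the layout where yyrhs is a flat list of right-hand-side symbol numbers with -1 ending each rule, and yyprhs[i] is the position in yyrhs where rule i starts. The grammar, the symbol numbering and the table contents below are all invented for illustration and were not produced by Bison.

```python
# Hypothetical symbol numbering for a toy grammar.
YYTNAME = ["$end", "alpha", "zeta", "theta", "gamma", "$accept", "X", "beta"]

# Rule 0: $accept -> X $end
# Rule 1: X -> alpha beta zeta
# Rule 2: beta -> theta gamma
YYRHS = [6, 0, -1, 1, 7, 2, -1, 3, 4, -1]  # RHS symbols, -1 ends each rule
YYPRHS = [0, 3, 7]                         # start of each rule's RHS in YYRHS
YYR1 = [5, 6, 7]                           # LHS symbol number of each rule


def rule_rhs(rule):
    """Return the RHS symbol numbers of a rule by scanning up to the -1."""
    start = YYPRHS[rule]
    end = YYRHS.index(-1, start)
    return YYRHS[start:end]


def rule_text(rule):
    """Render a rule in readable form, e.g. 'X -> alpha beta zeta'."""
    lhs = YYTNAME[YYR1[rule]]
    rhs = " ".join(YYTNAME[symbol] for symbol in rule_rhs(rule))
    return f"{lhs} -> {rhs}"


for rule in range(len(YYPRHS)):
    print(f"Rule #{rule}: {rule_text(rule)}")
```

Even for three rules, the reader has to juggle three arrays to recover one line of grammar, which is the hardship meant above.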

The IDE is primarily concerned with getting your job done with minimal
knowledge of the internals of the generated parser. It is not merely a

But all these ideas can still be implemented by modifying gdb
sufficiently so that it will understand grammars and lexers in the
same way it understands C or C++ or Ada.

This 'understanding' will come with a little help from 'debugging
information' generated by Bison and flex. So the user may be able to

$ gdb grammar.y lexer.l

[gdb will run flex and bison, collect state and table information and
get ready to simulate a test-parse step-by-step - whether it is going
to do this by compiling and running the generated parser (
that's a design decision, largely irrelevant to the user.]

(gdb) test-parse testfile.c
Entering State 0
Stack: 0
Next input token: INT
(gdb) step
Shifted Token INT
Entering State 3
Stack: 0 3
Next input token: ID
(gdb) show state 3
State 3 is:
Rule #3: X -> alpha . beta zeta
Rule #4: beta -> . theta gamma

and so on; the user is free to tweak the grammar a bit,

Once this is done, it should be easy to implement this process through
a good graphical IDE that can interface with gdb.
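The step-by-step session imagined above boils down to a table-driven shift/reduce loop that pauses after every action. Here is a minimal sketch of that idea. The one-rule grammar, the ACTION/GOTO tables and the StepParser class are all hypothetical and hand-written; a real tool would take the tables from the parser generator.

```python
# Hypothetical grammar with one rule:  1: decl -> INT ID SEMI
ACTION = {
    (0, "INT"): ("shift", 1),
    (1, "ID"): ("shift", 2),
    (2, "SEMI"): ("shift", 3),
    (3, "$end"): ("reduce", 1),
    (4, "$end"): ("accept", None),
}
GOTO = {(0, "decl"): 4}
RULES = {1: ("decl", 3)}  # rule number -> (LHS, length of RHS)


class StepParser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$end"]
        self.pos = 0
        self.stack = [0]  # stack of state numbers
        self.done = False

    def step(self):
        """Perform exactly one parser action and describe it."""
        state, token = self.stack[-1], self.tokens[self.pos]
        action = ACTION.get((state, token))
        if action is None:
            raise SyntaxError(f"unexpected {token} in state {state}")
        kind, arg = action
        if kind == "shift":
            self.stack.append(arg)
            self.pos += 1
            message = f"Shifted token {token}"
        elif kind == "reduce":
            lhs, length = RULES[arg]
            del self.stack[-length:]  # pop one state per RHS symbol
            self.stack.append(GOTO[(self.stack[-1], lhs)])
            message = f"Reduced by rule #{arg} ({lhs})"
        else:
            self.done = True
            message = "Accepted"
        return f"{message}; stack: {' '.join(map(str, self.stack))}"


parser = StepParser(["INT", "ID", "SEMI"])
trace = []
while not parser.done:
    trace.append(parser.step())
print("\n".join(trace))
```

A debugger command such as "step" would simply call step() once and print the returned line, and "show state N" would print the item set of that state.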

One of the main challenges of implementing such a system is that Bison
depends on the external yylex to get its tokens. It becomes necessary
to tweak flex also!

Other commercial systems have taken a shortcut and provided
facilities to describe the lexer specification as part of the grammar.
I was wondering whether we could include a lexer specification in Bison:

%token TOK_IF     "if"     { /* action for if */ }
%token TOK_NUMBER "[0-9]+" { /* action for number */ }

though this would be breaking an age-old tradition of separating the
lexer and parser (which has its own advantages).
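To give a feel for what a tool could do with token declarations like the two above, here is a rough sketch that builds a scanner directly from (name, pattern) pairs. The TOKEN_SPECS and ACTIONS tables and the tokenize function are my own illustration; they assume the patterns are ordinary regular expressions, and skipping whitespace is an extra assumption that the declarations do not state.

```python
import re

# Token name -> pattern, mirroring the two %token lines above.
TOKEN_SPECS = [
    ("TOK_IF", r"if"),
    ("TOK_NUMBER", r"[0-9]+"),
]

# Hypothetical per-token actions: compute a semantic value from the text.
ACTIONS = {
    "TOK_IF": lambda text: None,
    "TOK_NUMBER": lambda text: int(text),
}

# One combined pattern with a named group per token, plus whitespace to skip.
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPECS)
    + r"|(?P<SKIP>\s+)"
)


def tokenize(text):
    """Yield (token name, semantic value) pairs, as a yylex stand-in."""
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        pos = match.end()
        kind = match.lastgroup
        if kind != "SKIP":
            yield kind, ACTIONS[kind](match.group())


print(list(tokenize("if 42")))
```

Because the scanner is derived from the same file as the grammar, a test-parse tool would not need a separate yylex to be supplied.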

These are my thoughts on this; please do let me know yours. I got
interested in this whole idea when I saw "visual parse++" (by ; it is an awesome tool both for generating
industry-grade parsers and for understanding parsing. This company
has over 100 clients, including some big companies like VISA and
the US Army :) and many universities, including Yale! So I am sure such a
tool would be of enormous utility to the community.


"When you have eliminated the impossible, whatever remains, however
improbable, must be the truth".
-Sherlock Holmes, The Sign of Four.
