[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Interactive parsing with Bison

From: Hans Aberg
Subject: Re: Interactive parsing with Bison
Date: Fri, 23 Jun 2006 22:42:41 +0200

On 23 Jun 2006, at 22:11, Satya wrote:

As for letting the Bison parser doing the things you want, be able to
step through the grammar, the information is basically already there,
in the .output file that Bison produces. Essentially, the "." in each
state tells exactly how far the parsing has proceeded in each rule.
So one needs to extract which state is the current (actually stacked
in the parser stack), and then the IDE displays an arrow in the .y
grammar for each rule in that state at the point the "." where the
state is.

Yes, there is a lot of information in the .output file we could use;
but this is intended for a human reader;

What I am saying is this:

For a computer programming language, the IDE displays arrows in the stacked functions, where the program counter currently is. For each function call, there is only one arrow: there is only one program counter for each subroutine.

Now, if one should do the same thing with the .y grammar file that is fed to Bison, then, as the parsing proceeds over sets of rules, one ends up with: a stack of parser states, and for each state, a set of arrows. This corresponds to a function call with multiple program counters.

The position of these arrows is what is listed in the .output file: for each state, a list of rules, and in each rule a "." telling where parsing currently is in the .y grammar. So one reverse-engineers the generation of this .output file, and puts a suitable form of it into the .debug file.

My idea of an IDE is this:

1. The user composes his grammar using the IDE/ loads a grammar from a
file. The IDE starts bison as a child process.

2. Bison is started with a special command line flag so it knows to go
into a 'Debug mode'

3. In this mode Bison will stop just after computing all tables for
the input grammar (tables_generate()) and dump all details (symbol
table, rules, ritems, states, and various tables from tables.c) into a file. The file is in a fixed pre-determined format; It
could be plain text or binary (so we get to be 'sophisticated' )

4. The IDE process will now read this file and with this information
at hand, it can start a 'parse-simulation' on a test input file.

5. This will be done using a built in yyparse() (programmed in the
IDE) that stops after each step (a shift or reduce)

6. The user can choose to alter the input/ watch the stack, state etc,
under the control of the IDE

7. If the user wants to change the grammar in the middle of a parse,
the tables will have to re-computed and a fresh file is
created by executing bison once again; Then the input is re-parsed
until the current token is reached;

8. When the user is done perfecting his grammar, he issues a compile
command that will instruct bison to read the file,
execute m4 and generate the file.

Building upon the above steps, we can implement breakpoints, watches,
and traces using an IDE.

Instead of communicating via files, we could also implement an IPC
between bison and the IDE; If this IPC uses sockets, we can implement
a bison-server and have a web service that clients can use to get
their parsers done; (just tryin to talk a bit futuristic here)..

Also, note that this glossed over candy idea has a potential flaw; we
still rely on a yylex() to supply tokens for us! where would that come
from?! By some way, we have to compile the yylex function (either
supplied by the user or produced by flex) and link it to our debugger
dynamically at the time of simulated parse. Do you think that's
(huh, this must be the reason why all 'visual parser' IDEs I have seen
also implement a lexer as part of their 'grammar' so they have more

One ends up with a stack of function calls, where the yyparse() function is an intermediate. If the yyparse() is called, one should be able to see the parser stack, and if a state is selected, one should be able to see a set of arrows for that state in the .y grammar. But one would want to be able to set steps in the yylex(), as well as in the functions that may be called in the parser actions.

If there already is some IDE with the capacity of displaying multiple
arrows as above, then this would vastly save the programming time

Yes, that's a good idea; I am looking at Eclipse, KDevelop  and Ajunta
(for GNOME). Probably Eclipse would be a good choice; Emacs is also a
good idea; (I am hoping I wouldn't be kicked out of here if I said I
always use vi :)

But to start with, we need to perfect the debugging interface before
an IDE can be brought into picture. I am willing to implement it if we
all agree on this and come up with a good format for the .debug file.

The trick will be to find a good debugging format, which the IDE can use for its display. The only possible extra feature that comes to my mind that IDE must have, is the ability to display multiple arrows. But it should already have that, in order to be able to display the stacked function calls.

  Hans Aberg

reply via email to

[Prev in Thread] Current Thread [Next in Thread]