help-bison

Re: about lr(1) parsers


From: Hans Aberg
Subject: Re: about lr(1) parsers
Date: Wed, 2 Aug 2006 19:44:06 +0200


On 2 Aug 2006, at 11:50, chinlu chinawa wrote:

Keywords, or hardwired names, are normally added
in the lexer, which may be generated by Flex. If
you want to have constructs such as 'define
<name>...', one way to do it is to make a lookup
table, where <name> is entered along with
syntactic and semantic data. When the lexer
finds a name, it checks the lookup table, and if
found, returns the correct token and passes the
semantic data to the parser.

Hello Hans,

Um, what do you mean by semantic data?

With respect to the .y file, syntactic data is the grammar, and the semantic data is what you put into the actions, plus the other programming.

I've used your example, as I saw a simple cpp would be
a good starting point.

However, I'm far from being able to do things like
this, if it's what you mean by semantic data:

#define max(x,y) ((x)>(y)?(x):(y))

As it would require macro expansion and a lot of
things I cannot do yet. Instead I'm using simple
constructions like:

#define some   thing

You have a .y grammar rule that recognizes the '#define' token, plus the syntax of the definition. Then its action should put the definition name, 'max' (resp. 'some'), into the lookup table, along with its definition, '((x)>(y)?(x):(y))' (resp. 'thing'), in some suitable form. In addition, if you have more than one token type to be defined, that should also be recorded in the lookup table.
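A sketch of what such a .y rule might look like (the token names, the %union field, and the helper table_insert are all invented for illustration, not the actual grammar):

```yacc
%union { char *text; }
%token DEFINE                 /* the '#define' keyword */
%token <text> IDENTIFIER
%token <text> DEFINED_NAME
%%
definition:
    DEFINE IDENTIFIER IDENTIFIER
      { /* record the name ($2) and its definition ($3)
           in the lookup table */
        table_insert($2, DEFINED_NAME, $3); }
  ;
```

This only handles the simple '#define some thing' case; a fuller cpp would need a more elaborate rule for the definition body.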

Then the lexer (.l file) has a rule
  identifier {...}
that recognizes identifiers like 'max' (resp. 'some'). It then checks whether the name 'max' (resp. 'some') is present in the table. If it is not, it will just return a %token identifier.

If the name is present in the lookup table, it will use the information there for the token return: the token type plus the semantic value of the token, '((x)>(y)?(x):(y))' (resp. 'thing').
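On the lexer side, the identifier rule might look like this in Flex (table_find, the struct entry fields, and yylval.text are hypothetical names, assuming a lookup table along the lines described above):

```lex
[A-Za-z_][A-Za-z0-9_]*  {
        struct entry *e = table_find(yytext);
        if (e) {
            yylval.text = e->expansion;  /* semantic data to the parser */
            return e->token;             /* e.g. DEFINED_NAME */
        }
        yylval.text = strdup(yytext);
        return IDENTIFIER;               /* plain %token identifier */
    }
```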

So far, I've started to understand finite automata,
and have implemented one, together with linked lists
rather than perfect hashing for the lookup tables,
which lets me start playing around with #define and
#ifdef more easily.

Indeed, since I started to understand finite automata,
the same has happened with lr(1) parsers, and I've got
two questions today; hope you can help.

An LR parser uses what is called a push-down automaton: a finite automaton which in addition has a stack of values.

On the one hand, I don't seem to fully understand what
a finite automaton should output, and how this output
should be fed to the parser.

If you use a program like Flex, the lexer it generates feeds the parser that Bison generates with token numbers. You define these token numbers by putting %token ... declarations in the .y file.

As you said, I'm using hardcoded keywords, and can
recognize for example when a #define, #ifdef or just a
comment is coming over, though none of them looks like
a good example for this question.

Supposing that I have this:

sum 1+1

And that I recognize `sum' with the finite automaton,
what should I pass to the parser (apart from 1+1 I
guess)?

If 'sum' is a keyword, i.e., hardcoded, then you have in the .y file say
  %token SUM "sum"
  %token PLUS "+"
...
%%

summation:
   "sum" NUMBER "+" NUMBER { $$ = $2 + $4; } /* Or: SUM, PLUS */
  ...

The lexer (.l file) then has rules like
  "sum" { return SUM; }
  "+"   { return PLUS; }

If you want to have a loop, then you would need to build a "closure", i.e., some code that can be executed after the parsing has taken place.


I thought feeding a parser with `1+1' would output the
result of that operation, though if you look at this:

http://en.wikipedia.org/wiki/LR_parser

At the end of the example part, it says:

"The rule numbers that will then have been written to
the output stream will be [ 5, 3, 5, 2 ] which is
indeed a rightmost derivation of the string "1 + 1" in
reverse."

And my question in this regard is: what am I supposed
to do with that `[ 5, 3, 5, 2 ]'?

You do not really have to worry about that, as long as you write your rule actions so they are semantically correct. I.e., you only need to worry about how $$ should be computed from the $k values.

  Hans Aberg





