RE: how to recognize a multiple line comment ?
From: Vincent Zweije
Subject: RE: how to recognize a multiple line comment ?
Date: Thu, 7 Apr 2005 14:20:52 +0200
Bruce Lilly wrote:
> On Thu April 7 2005 02:29, James Yu wrote:
> > Dear all,
> >
> > I am trying to define a rule for recognizing a multiple line comment
> > in C, but I have not yet succeeded.
> > Thus, I am posting my lex segment to this email and hope you can
> > point out some mistakes for me.
> >
> > == flex code segment ==
> > slash "/"
> > asterisk "*"
> > comment ({slash}{asterisk}+([^*]|[\n])*{asterisk}+{slash})
> > == flex code segment ==
>
> Don't try to do too much in lexical analysis; some things are better
> handled in parsing under control of a grammar, where context is
> available.
>
> For example, your pattern won't handle the legal C comment
> /* foo ***** bar */
That's exactly why he came here. It didn't work.
It also sounds like a classic homework assignment. ;-)
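For what it is worth, the posted pattern fails on that input because,
after the run of asterisks, it insists on a slash and has no way to
resume the comment body. A pattern often used in flex for this is
"/*"([^*]|\*+[^*/])*\*+"/", which lets a run of asterisks be followed by
ordinary text again. The sketch below is a hand-written C equivalent of
that automaton, only to show the states involved; the function name
match_c_comment is made up for this illustration and is not part of flex.

```c
#include <stddef.h>

// Return the length of the C comment that begins at s[0], or 0 if s does
// not start with a complete (terminated) comment.
// Hypothetical helper, written only to illustrate the automaton.
size_t match_c_comment(const char *s)
{
    if (s[0] != '/' || s[1] != '*')
        return 0;

    size_t i = 2;        // first character after the opening "/*"
    int after_star = 0;  // 1 if the previous body character was '*'

    while (s[i] != '\0') {
        if (after_star && s[i] == '/')
            return i + 1;            // closing star-slash found
        after_star = (s[i] == '*');  // a run of stars keeps this set
        i++;
    }
    return 0;            // input ended inside the comment
}
```

Called on "/* foo ***** bar */" this returns 19, the length of the whole
comment. On "/*/" it returns 0, because the asterisk of the opener may
not double as the asterisk of the closer.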
> and it will inappropriately match the text in the quoted string in
> char foo[] = "/*bar*/";
That depends on the rest of the token definitions (the string
token, to be precise).
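To make that concrete: a scanner works left to right, so in the example
above it reaches the opening double quote before it ever looks at the
/* inside the string. If a string token exists, it consumes everything
up to the closing quote and the comment rule never gets a chance. Here
is a rough C sketch of that behaviour. The name count_comments is
invented for the example, and character constants and line splicing are
deliberately ignored to keep it short.

```c
#include <stddef.h>

// Count the complete comments in src while treating string literals as
// single opaque tokens. Hypothetical helper for illustration only; it
// does not handle character constants such as '"'.
int count_comments(const char *src)
{
    int count = 0;
    size_t i = 0;

    while (src[i] != '\0') {
        if (src[i] == '"') {
            // String token: skip to the matching quote, honouring
            // backslash escapes.
            i++;
            while (src[i] != '\0' && src[i] != '"') {
                if (src[i] == '\\' && src[i + 1] != '\0')
                    i++;
                i++;
            }
            if (src[i] == '"')
                i++;
        } else if (src[i] == '/' && src[i + 1] == '*') {
            // Comment token: skip to the first star-slash pair.
            i += 2;
            while (src[i] != '\0' && !(src[i] == '*' && src[i + 1] == '/'))
                i++;
            if (src[i] != '\0') {
                i += 2;
                count++;
            }
        } else {
            i++;
        }
    }
    return count;
}
```

With the string branch in place, the line char foo[] = "/*bar*/"; yields
no comment at all. Remove that branch and the same line yields one,
which is the inappropriate match Bruce describes.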
> A properly-designed grammar will also allow resolving conflicts by
> using precedence and associativity to handle complex cases
Opinions probably differ about that statement.
> int x, y, z, *px;
> char foo[3];
>
> x = 2;
> px = &x;
> y = 8;
> z = y/*px;
> ...
> strncpy(foo, "*/", sizeof(foo));
>
> i.e. you have the opportunity to decide whether * binds more tightly
> to / for comments than as a pointer indication.
You must be concluding that the C syntax is not properly
designed, because it will take the /*px ... "*/ as a comment and
reject the program as invalid. You might have a point there
though. :-)
I would *definitely* not want to have to resolve this problem
using precedence.
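A quick way to see what the tokeniser does with that fragment is to
locate the first comment opener and the first closer after it, which is
all a C scanner does once it is outside a string. The helper below is
only a sketch under that assumption (no string literal precedes the
opener); first_comment_span is a name made up for this mail.

```c
#include <stddef.h>
#include <string.h>

// Find the first comment in src. On success store its starting offset in
// *start and return its length; return 0 if there is no complete comment.
// Hypothetical helper: it assumes no string literal precedes the opener.
size_t first_comment_span(const char *src, size_t *start)
{
    const char *open = strstr(src, "/*");
    if (open == NULL)
        return 0;

    // The closer must come after the two characters of the opener.
    const char *close = strstr(open + 2, "*/");
    if (close == NULL)
        return 0;

    *start = (size_t)(open - src);
    return (size_t)(close + 2 - open);
}
```

Fed the two interesting lines of the example, the span starts at the
slash in y/*px and ends inside the string argument of strncpy, so what
remains after the comment is a lone double quote followed by
, sizeof(foo)); and the compiler complains about an unterminated string.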
However, you are mixing tokenisation (flex task) and parsing
(yacc/bison task). They are based on different kinds of
language (regular versus context-free), and both have their
advantages and disadvantages.
Ciao. Vincent.