help-flex

Re: Featuritism


From: Hans Aberg
Subject: Re: Featuritism
Date: Wed, 9 Jan 2002 20:25:21 +0100

At 14:51 +0100 2002/01/09, Akim Demaille wrote:
>1. computation of the location.
>
>I.e., completely help the beginner.  But then, as I explained, you
>have to explain to Flex what is skipped (again, think of the
>difference between a start condition for strings, and another for
>comments).

The location should only be that of the token returned, that is, [yytext,
yytext+yyleng).

>  You also need to get into gory details, such as how
>tabulations are understood by the application.

The Flex scanner should avoid that and instead provide a pointer to the
line, if one is needed, for further computations, at least when the
locations are used for errors alone. Anything else will generate
unnecessary overhead.
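
To illustrate the point about deferring the work: a minimal sketch, assuming the scanner hands over a pointer to the start of the current line and a pointer to the token. The function name display_column and the tab_width parameter are hypothetical and not part of Flex; the column is computed only when an error is actually reported, so the scanner itself never needs to know how tabs are interpreted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a Flex API: 0-based display column of `tok`
   within the line starting at `line_start`, with each tab advancing to
   the next multiple of `tab_width`. Both pointers must point into the
   same line buffer. */
size_t display_column(const char *line_start, const char *tok,
                      size_t tab_width)
{
    assert(line_start <= tok && tab_width > 0);
    size_t col = 0;
    for (const char *p = line_start; p < tok; ++p) {
        if (*p == '\t')
            col += tab_width - (col % tab_width); /* jump to next tab stop */
        else
            ++col;
    }
    return col;
}
```

The application chooses tab_width, which keeps that policy out of the generated scanner.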

If you have applications other than errors in mind, please give some
examples.

>  Worse yet: in
>multibyte characters, do you count bytes or characters?

Everything points to the conclusion that, for efficient programming with
Unicode, one should use fixed-width characters. As Unicode uses 21 bits
(0x000000..10FFFF I think), that will most likely be 32-bit characters
(UTF-32), the first typical alignment that is large enough.

So, then this becomes a non-issue.

But the stream position (ftell) numbers are independent of this.
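
To make the bytes-versus-characters distinction concrete: a small sketch, assuming valid UTF-8 input, that counts code points in a byte sequence. The function name utf8_code_points is made up for this illustration. With UTF-32 no such pass is needed, since the character count is simply the number of 32-bit units.

```c
#include <assert.h>
#include <stddef.h>

/* Count code points in a UTF-8 byte sequence of length `n` by counting
   every byte that is not a continuation byte (bit pattern 10xxxxxx).
   Assumption: the input is valid UTF-8; nothing is validated here. */
size_t utf8_code_points(const unsigned char *s, size_t n)
{
    assert(s != NULL || n == 0);
    size_t count = 0;
    for (size_t i = 0; i < n; ++i)
        if ((s[i] & 0xC0) != 0x80)
            ++count;
    return count;
}
```

For the four bytes 'a', 0xC3, 0xA9, 'z' the byte length is 4 while the code point count is 3, which is exactly the ambiguity the question is about.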

>2. computation of something that can ease the computation of the
>locations (typically the current position).
>
>But then, the author of the scanner will have to put some operation to
>compute the locations from the position at many different places.

I am not sure what you mean here: If the buffer keeps track of the stream
(ftell) position, that position can itself be used to tell the location of
the token found.

>As a conclusion, the most important feature in the scanner for
>location tracking is the ability to have means to specify some
>`automatic' actions.

The stream position cannot be found unless the buffer stores it and
provides a separate function for it.
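
A minimal sketch of what such a buffer could look like, under stated assumptions: the struct scan_buffer, its base field, and token_span are hypothetical names and not part of Flex. The sketch also assumes the stream is read in binary mode, so that ftell values are plain byte offsets.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer record: `base` is the stream offset (as ftell would
   report it) of data[0]; the scanner would update it on every refill. */
struct scan_buffer {
    const char *data;
    size_t      size;
    long        base;
};

/* Half-open range of stream positions: [begin, end). */
struct stream_span {
    long begin;
    long end;
};

/* Map a token lying inside the buffer (yytext, yyleng in Flex terms)
   to stream positions. */
struct stream_span token_span(const struct scan_buffer *b,
                              const char *text, size_t leng)
{
    assert(text >= b->data);
    assert((size_t)(text - b->data) + leng <= b->size);
    long off = (long)(text - b->data);
    struct stream_span s = { b->base + off, b->base + off + (long)leng };
    return s;
}
```

The only per-token cost is one pointer subtraction and two additions.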

>  Flex provides an immense help in this area:
>YY_USER_ACTION *and* the yylex snippet in

One suggestion was for special macros YY_LOCATION_BEGIN (probably executed
just before YY_USER_ACTION) and YY_LOCATION_END (executed just before the
rule action itself).

>----------------------------------------
>From: Richard Stallman <address@hidden>
>    The GNU coding standard makes no suggestion wrt locations which
>    include a range (starting point, ending point, both described as
>    (line, column)).
>
>It could be a reasonable idea, but is it really useful?
>I can imagine it would lead to a lot more work in compilers
>to make them generate meaningful ranges, and I am not sure
>whether it would help users much.

I can add that my IDE allows various pieces of information to be left out:

The extra line of context can be set to NULL, in which case no such line is
displayed. Within this line, one can set the position to the beginning of
the line, in which case the line is not marked. The same goes for the
position numbers, which identify a token or location in the stream file
that might be highlighted when clicking on it in the IDE.

All these extras are optional, and programs are not required to supply
them. But if one supplies them, there is a format readily available.

>    As a user I'd say yes, this is definitely useful.  Several times, in
>    particular with C++, I had gcc error messages which I could not
>    interpret because I didn't know which item on the line was the
>    culprit.
>
>Simply giving one column number would be enough to show you where in
>the line the problem is, wouldn't it?  Would the range, beginning and
>end, really be a big improvement?

It is difficult to provide good errors in C++, because it relies more
heavily on context information than say C: Some stuff is not correctly
defined earlier, or some error occurs during template expansion or
something, making it difficult to identify the error.

However, setting that problem aside, a typical error would be displayed like
this (it is not the format I want to get at, but the information it gives
the human reader):
Error   : declarator expected
lex.pg_parser.cc line 566   do int 5;
                                   -
with the failing token underlined ("5" in the example). The "  do int 5;" is
a snippet from the failing line, with the token underlined or, if there is
only a position rather than a location, simply with that position marked.
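
A rough sketch of producing such a marked snippet, assuming the line contains no tabs and that begin and end are 0-based columns of a half-open range. The function name format_marked_line is invented for this illustration; it is not what the plugins described here actually do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical formatter: writes `line`, a newline, and a marker row with
   '-' under columns [begin, end) into `out`. An empty range (begin == end)
   marks that single position. Returns the number of characters written
   (excluding the terminating NUL), or -1 if `out` is too small. */
int format_marked_line(char *out, size_t cap, const char *line,
                       size_t begin, size_t end)
{
    size_t len = strlen(line);
    assert(begin <= end && end <= len);
    size_t marks = (end > begin) ? end - begin : 1;
    size_t need = len + 1 + begin + marks;
    if (need + 1 > cap)
        return -1;
    char *p = out;
    memcpy(p, line, len);
    p += len;
    *p++ = '\n';
    memset(p, ' ', begin);   /* indent up to the token */
    p += begin;
    memset(p, '-', marks);   /* underline the token */
    p += marks;
    *p = '\0';
    return (int)need;
}
```

Called with the line "do int 5;" and the range [7, 8), this yields the snippet with a single '-' under the "5".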

In addition, one can click on the display, and the file in question would
pop open, selecting the line in question. For this, the IDE needs the file
descriptor (which is located at compile time when the file name is searched
for), plus the delimitation of the token or location in terms of file
position numbers. The rest of the selection process is actually handled by
the OS.

As a compiler user, I find this information very useful, because one can
quickly identify the error and get to it by merely clicking on it.
When I made the Flex/Bison plugins (DLLs), I was intrigued to find that it
is relatively simple to achieve these effects.

  Hans Aberg




