Re: Featuritism


From: Akim Demaille
Subject: Re: Featuritism
Date: 09 Jan 2002 14:51:58 +0100
User-agent: Gnus/5.0808 (Gnus v5.8.8) XEmacs/21.4 (Common Lisp)

>>>>> "John" == John W Millaway <address@hidden> writes:

John> It is important to note that every effort is made so that new
John> features in flex are OPTIONAL. If a feature is disabled, the
John> scanner is generated as if that feature did not exist. A new
John> feature does NOT necessarily affect the performance of the
John> scanner. Besides, a few new features in 10 years can hardly be
John> called, "featuritis!"

I agree, and I must have forgotten an `m': I meant featurism :)
And Unicode does seem like a heck of a feature to me :)


As long as it is optional, it has no impact on me, so it cannot bother
me.


I think I should summarize my point of view, and certainly clarify it.
I apologize if it was too obscure, and worse yet, if it seemed to be a
simple subjective opinion, not motivated by actual facts.


To give the full picture, I should start by saying that _I_ started
the big mess about locations because, like Hans, I grew tired of the
incredibly imprecise messages from most free software tools
(have you ever split a line with several expressions to find out which
one is triggering the error?).

This is a long road, since you cannot just show up on the GCC mailing
list and ask for more complete error messages.  The road starts
with having a good prototype that saves programmers from the
hassle of keeping track of the locations.  That is the very reason why
I first dove into Bison, and eventually became the maintainer:
locations.

Then you have to experiment, equip several tools to make sure the idea
is sound, and can be extended to all the tools.  Of course, the
initial scheme was not sufficient, and it slowly grew.  Today I have
converted a fair number of applications (most of them being personal,
but Bison is now using this scheme, the snippets I sent you are part
of an O'Reilly book etc.).

I should say that wrt Bison the case is closed: I know the system
could have been better, but as of today, it is sufficient.

I am a _big_ fan of simplicity.  Just have a look at a2ps, and you'll
see what I mean; people without any idea of what PostScript, n-upping,
Texinfo, LaTeX, HTML, PDF etc. are can nevertheless perform quite
complex jobs.

So naturally for me the next step was having Flex do the other part of
the job.  That's also why years ago I asked Vern who was the
maintainer, and that's also why I recently came back to Flex.

In the meantime I had accumulated some experience of location
tracking at the scanner level.  And I think it cannot be done
properly there.  Of course I might be wrong, I am not claiming I'm right,
but I wanted to give you the list of the troubles that arise if you
dig that way.


First, a point of vocabulary.  What Bison names a location is the
information about the position of some symbols (tokens or
nonterminals).  Typically that information is composed of two
positions: the initial position, and the final position.  But the user
is free to choose her implementation.  So I will proceed with
`location' = +/- one range, and `position' = +/- one point.
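To make the vocabulary concrete, here is a minimal sketch in C of one
possible pair of types.  The names (position, location, location_step,
and so on) and the choice of 1-based lines and columns are assumptions
made for illustration; they are not Bison's own definitions.

```c
#include <assert.h>
#include <stddef.h>

/* A position is one point in the input: a line and a column, both
   1-based.  (Hypothetical layout, chosen for this sketch.) */
typedef struct {
    int line;
    int column;
} position;

/* A location is one range: where a symbol begins, and the position
   just past its last character. */
typedef struct {
    position begin;
    position end;
} location;

/* An empty location at line 1, column 1. */
static location location_init(void)
{
    location loc = { { 1, 1 }, { 1, 1 } };
    return loc;
}

/* Start a new range where the previous one ended. */
static void location_step(location *loc)
{
    assert(loc != NULL);
    loc->begin = loc->end;
}

/* Extend the range by `count' columns on the current line. */
static void location_columns(location *loc, int count)
{
    assert(loc != NULL && count >= 0);
    loc->end.column += count;
}

/* Extend the range by `count' lines; the column restarts at 1. */
static void location_lines(location *loc, int count)
{
    assert(loc != NULL && count >= 0);
    if (count > 0) {
        loc->end.line += count;
        loc->end.column = 1;
    }
}
```

A scanner would call location_step before each token, then
location_columns or location_lines according to what it matched.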

Flex can provide two different services:



1. computation of the location.

I.e., completely help the beginner.  But then, as I explained, you
have to explain to Flex what is skipped (again, think of the
difference between a start condition for strings, and another for
comments).  You also need to get into gory details, such as how
tabulations are understood by the application.  Worse yet: in
multibyte characters, do you count bytes or characters?

I think there is just no hope to achieve this goal: the very
definition of a location is just too dependent on the language you
consider.  In addition, not all users need the same location
definition.
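To show how quickly the options pile up, here is a small sketch in C of
a column counter.  The function name advance_column and its two knobs
(count_chars, tab_width) are hypothetical, and the input is assumed to
be UTF-8; a real application might need yet other conventions.

```c
#include <assert.h>
#include <stddef.h>

/* Return the 1-based column reached after reading `len' bytes of
   `text', starting from `column'.

   count_chars: if nonzero, count UTF-8 characters rather than bytes
                (continuation bytes, of the form 10xxxxxx, are skipped).
   tab_width:   if positive, a tab jumps to the next tab stop;
                otherwise a tab counts as a single column.

   Assumes valid UTF-8 and no newline in `text'. */
static int advance_column(int column, const char *text, size_t len,
                          int count_chars, int tab_width)
{
    size_t i;
    assert(column >= 1 && text != NULL);
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char) text[i];
        if (c == '\t' && tab_width > 0)
            column = ((column - 1) / tab_width + 1) * tab_width + 1;
        else if (count_chars && (c & 0xC0) == 0x80)
            continue;           /* UTF-8 continuation byte */
        else
            column++;
    }
    return column;
}
```

The same three bytes (an e with acute accent encoded on two bytes,
followed by `x') end at column 4 when counting bytes and at column 3
when counting characters, and neither answer is wrong in itself.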


2. computation of something that can ease the computation of the
locations (typically the current position).

But then, the author of the scanner will have to put some operation to
compute the locations from the position at many different places.
Actually, if you think about it, this is exactly the code that I sent
as an example.  Therefore, you have simplified almost nothing, and
introduced a feature that needs a lot of options (e.g., characters, or
bytes?).



In conclusion, the most important feature in the scanner for
location tracking is the ability to specify some `automatic'
actions.  Flex provides an immense help in this area:
YY_USER_ACTION *and* the yylex snippet in

%%
%{
  here
%}

With that you have everything you need.
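The division of labor between the two hooks can be imitated in plain C.
This is only a sketch, not flex-generated code: next_token is a
hand-written stand-in for yylex, USER_ACTION stands in for
YY_USER_ACTION (run for every match, before its action), and the
assignment at the top of next_token stands in for the snippet at the
head of the rules section (run each time the scanning routine is
entered).  All names are hypothetical, and tokens are assumed to be
runs of non-blanks on a single line.

```c
#include <assert.h>
#include <stddef.h>

/* Single-line range: first column of the token, and the column just
   past its last character. */
typedef struct {
    int first_column;
    int last_column;
} span;

static span current = { 1, 1 };     /* plays the role of yylloc */

/* Stand-in for YY_USER_ACTION: advance the end by the match length. */
#define USER_ACTION(len) (current.last_column += (int) (len))

/* Stand-in for yylex: return the length of the next token, or 0 at
   end of input.  *input is advanced past what was consumed. */
static size_t next_token(const char **input)
{
    assert(input != NULL && *input != NULL);

    /* Stand-in for the code at the top of the rules section. */
    current.first_column = current.last_column;

    for (;;) {
        const char *p = *input;
        size_t len = 0;

        if (*p == '\0')
            return 0;

        if (*p == ' ') {
            /* "Rule" for blanks: matched, counted, then skipped. */
            while (p[len] == ' ')
                len++;
            USER_ACTION(len);
            *input += len;
            current.first_column = current.last_column;
            continue;
        }

        /* "Rule" for words: matched, counted, returned. */
        while (p[len] != '\0' && p[len] != ' ')
            len++;
        USER_ACTION(len);
        *input += len;
        return len;
    }
}
```

Note that what to do with skipped text (here, stepping past blanks by
hand) still has to be written by the scanner's author, rule by rule.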




I hope this clarifies what I meant.

I invite you to implement a prototype of what you have in mind.  I
have several such applications, and this led me to this conclusion.


PS/  So why is it so simple for Bison then?  Because all the mess is
about defining the value of a location.  Performing operations on
locations is then much easier and much more uniform.
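As an illustration of how uniform these operations are, here is a
sketch in C.  The location of a rule is taken to run from the beginning
of its first symbol to the end of its last one, which is, to my
knowledge, what Bison does by default.  The types and the names
location_merge and location_format are hypothetical; the output format
follows the `input.tig:1.4-2.6' style quoted in the messages below, and
whether the end column is inclusive is left to the application.

```c
#include <assert.h>
#include <stdio.h>

typedef struct { int line; int column; } position;
typedef struct { position begin; position end; } location;

/* Location of a construct spanning from `first' to `last'. */
static location location_merge(location first, location last)
{
    location result;
    result.begin = first.begin;
    result.end = last.end;
    return result;
}

/* Write `file:line.col-line.col' into buf, shortened to
   `file:line.col-col' when the range stays on one line, and to
   `file:line.col' when it is a single point.
   Return nonzero if the text fit in the buffer. */
static int location_format(char *buf, size_t size, const char *file,
                           location loc)
{
    int n;
    assert(buf != NULL && file != NULL);
    if (loc.begin.line != loc.end.line)
        n = snprintf(buf, size, "%s:%d.%d-%d.%d", file,
                     loc.begin.line, loc.begin.column,
                     loc.end.line, loc.end.column);
    else if (loc.begin.column != loc.end.column)
        n = snprintf(buf, size, "%s:%d.%d-%d", file,
                     loc.begin.line, loc.begin.column, loc.end.column);
    else
        n = snprintf(buf, size, "%s:%d.%d", file,
                     loc.begin.line, loc.begin.column);
    return n >= 0 && (size_t) n < size;
}
```

None of this depends on how the scanner decided what a column is: once
the tokens carry their locations, the rest is mechanical.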

PS2/ My plan for locations is almost done.  I am waiting for Bison
1.31 to be released, then I will start working on GCC's C parser, trying
to equip it with better locations.  If it works, I will ask for its
integration, and try to adjust the GNU coding standards for better
messages.


----------------------------------------
From: Richard Stallman <address@hidden>
Subject: Re: Error messages with fine locations
To: address@hidden
Date: Tue, 24 Oct 2000 18:44:25 -0600 (MDT)
Reply-to: address@hidden

    The GNU coding standard makes no suggestion wrt locations which
    include a range (starting point, ending point, both described as
    (line, column)).

It could be a reasonable idea, but is it really useful?
I can imagine it would lead to a lot more work in compilers
to make them generate meaningful ranges, and I am not sure
whether it would help users much.

Why do you ask?

    input.tig:1.4-2.6: type mismatch

If we are going to have a format for this, that format seems
reasonable to me.  But before I could agree to it, someone should
implement it in compile.el of Emacs just to make sure it won't be a
pain in the neck to do so.

    input.tig:123.4-6: type mismatch

I have nothing against that.

----------------------------------------
From: Richard Stallman <address@hidden>
Subject: Re: Error messages with fine locations
To: address@hidden
Date: Wed, 25 Oct 2000 18:39:11 -0600 (MDT)
Reply-to: address@hidden

    |     The GNU coding standard makes no suggestion wrt locations which
    |     include a range (starting point, ending point, both described as
    |     (line, column)).
    | 
    | It could be a reasonable idea, but is it really useful?

    As a user I'd say yes, this is definitely useful.  Several times, in
    particular with C++, I had gcc error messages which I could not
    interpret because I didn't know which item on the line was the
    culprit.

Simply giving one column number would be enough to show you where in
the line the problem is, wouldn't it?  Would the range, beginning and
end, really be a big improvement?

(Of course, making GCC do either one would require substantial work on
GCC.)

    In addition support for ranges comes for free if you use Bison, i.e.,
    *it* computes the ranges.  It doesn't require any more work than plain
    filename + lineno.

Interesting.  That could make it much easier in GCC too.

Do you want to work on compile.el?
----------------------------------------


