help-flex

Re: Locations suggest -- we're stupid


From: Nikos Balkanas
Subject: Re: Locations suggest -- we're stupid
Date: Sat, 12 Jan 2002 17:40:02 +0200

----- Original Message -----
From: Hans Aberg <address@hidden>
To: Nikos Balkanas <address@hidden>
Cc: Help Flex <address@hidden>
Sent: Friday, January 11, 2002 4:36 PM
Subject: Re: Locations suggest -- we're stupid


> At 19:17 +0200 2002/01/09, Nikos Balkanas wrote:
> >> Did you first, in the Flex source file main.c, check_options(), zip out
> >>   if ( do_yylineno )
> >>     /* This should really be "maintain_backup_tables = true" */
> >>     reject_really_used = true;
> >> recompile, and then do timings on a test with and without the
> >> %option yylineno?
> >>
> >> If so, what did the test show, on which computer?
> >
> >Nothing of the sort.
>
> So then remove that line, and then redo the timing:

Definitely not. Read below.

> > Read from stdin into a buffer. yy_scan_buffer(). Fired
> >up yylex(). Timed it (not profiled it) running through the patterns once.
> >Then on EOF I yy_scan_buffer() again and throw yylex into a state that
> >counts up newlines. Timed that too. This is faster than scanning yytext
> >for '\n's using C. Showed consistent difference. By how much I don't
> >remember.
>
> Please give info about computer and timing differences. If it was a long
> time ago, it may have changed, because CPUs are now very fast relative
> to what they can read from the hard disk, so there is more time left
> over for examining each character.
>
> > There
> >is no point to it. No one is arguing to remove yylineno.
>
> There is still a point getting to know the difference.

Sorry, it is a non-issue. I am not going to spend time benchmarking
redundant and, at times, futile code. See below.
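
For anyone who does want numbers, the measurement described above (input
already in memory, then a timed pass that only counts newlines) could be
sketched roughly as below. This is only a stand-in under stated assumptions:
it times a plain memchr() loop rather than a real yylex() pass, and the
names count_newlines and time_newline_pass are hypothetical, not part of
flex.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Count '\n' bytes in an in-memory buffer using memchr(). */
static size_t count_newlines(const char *buf, size_t len)
{
    size_t lines = 0;
    const char *p = buf;
    const char *end = buf + len;

    while (p < end && (p = memchr(p, '\n', (size_t)(end - p))) != NULL) {
        lines++;
        p++;                    /* step past the newline just found */
    }
    return lines;
}

/*
 * Hypothetical timing helper: run `passes` counting passes over the
 * buffer and return the CPU time used, in seconds. The count from the
 * last pass is stored in *lines_out.
 */
static double time_newline_pass(const char *buf, size_t len,
                                int passes, size_t *lines_out)
{
    size_t lines = 0;
    clock_t start = clock();

    for (int i = 0; i < passes; i++)
        lines = count_newlines(buf, len);

    clock_t stop = clock();
    *lines_out = lines;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

clock() reports CPU time at a coarse resolution, so a large buffer and many
passes would be needed before two variants can be compared, and an
optimizing compiler may fold the repeated passes together.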

> But with Unicode in place, one may hook up additional code converters
> before the characters end up in the Flex lexer, so the difference will
> be even less then.

This doesn't belong here. If you want to talk about Unicode, please start
another thread. If you plan to use flex at previous steps the problem
multiplies. Otherwise it remains constant. It definitely doesn't shrink or
go away.

> >I use overall time which is affected by I/O. One cannot do anything about
> >disk I/O and the system calls for doing it. One can hope only to make fast
> >software on top of that. Remember times are additive (unless flex does I/O
> >on a separate thread, which it currently doesn't). Then again you have
> >people that, realizing this, work from memory instead of a hard disk (after
> >the first pass use yy_scan_buffer for subsequent passes). Furthermore, in
> >development time when one debugs a particular input, many OSes (Linux for
> >example) will cache that input into memory (writing to disk, however, is
> >the big hog, which cannot be cached).
>
> The point is that on a modern computer system with paging, buffers, there
> is no real way to ensure that those bottlenecks do not appear.

There are no bottlenecks in blocking I/O. Everything is cumulative. Every
little delay on top of what is necessary will count. And yes, there may be
some way to do something about these things. But it is not gonna do much
good if the lexer is slow.

> >Sorry, no excuse for slow code :-)
>
> The main point is always that it does not matter if the code is slow if it
> is not executed frequently enough relative to the other stuff executed in
> the program.

Sure just put a few futile loops (why not infinite?) in places that are not
executed frequently. And let the user find out about them. Surprise him! :-)

> It may sound like a big overhead, checking each character, but that is
> already done a number of times in buffers and code converters, etc. And
> most overall time in a parser is typically not spent in the parser, but
> in the actions.

Yes, you are right. A lot of converters and buffers that are already out
there are implemented wrong. They are not using flex.
And despite my best efforts, you still don't get it. Running a slow and
heavy parser over your slow lexer won't make your lexer run faster.
However, with a fast lexer you will save some time and then be one step
closer to making your parser faster.

I haven't said that it is a big overhead. Just an overhead. When one is
careless, overheads tend to pile up...

[...snip...]
> >I consider that the surest way to improve as a programmer is to consider
> >one's own time inexpensive. Then one can turn some really nice (reusable)
> >code. Of course that means putting extra time to it at off hours,
> >initially.
>
> That was the case in the early days of computers. This is no longer the
> case: Programmer time is always the most expensive part.

I guess the motto is: bigger (and slower) software is better. Well, that's
Microsoft's philosophy. Is this what flex has become? It certainly didn't
start that way.

Flex's trademark is speed and pattern flexibility. I need it that way to do
my work.

Consider the following (actually very common) code snippets:

A)

<STATE1>{
.                        {
                               char *p = yytext;

                               /* advance the pointer, not the character */
                               while (*p) if (*p++ == '\n') yylineno++;
                          }
\n                      {
                               char *p = yytext;

                               while (*p) if (*p++ == '\n') yylineno++;
                          }
}

B)
<STATE2>{
.                          {}
\n                        {  yylineno++; }
}

I would be embarrassed writing (A). Guess what? This is the code that gets
executed for pattern (B) if you use %option yylineno (on top of REJECT).
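
To make the difference concrete outside of flex, here is a minimal C sketch.
It is not flex's generated code: the names tally, action_rescan,
action_newline_rule, run_style_a and run_style_b are made up for
illustration, and the input is fed one character at a time to imitate the
"." and "\n" rules above.

```c
#include <stddef.h>

/* Hypothetical bookkeeping, not part of flex. */
struct tally {
    int lines;               /* newlines counted */
    size_t bytes_rescanned;  /* bytes of matched text inspected again */
};

/* Style (A): every action re-walks the matched text looking for '\n'. */
static void action_rescan(const char *text, struct tally *t)
{
    const char *p = text;

    while (*p) {
        t->bytes_rescanned++;
        if (*p++ == '\n')
            t->lines++;
    }
}

/* Style (B): only the "\n" rule has an action, and it just increments. */
static void action_newline_rule(struct tally *t)
{
    t->lines++;
}

/* Drive style (A): each one-character token is handed to the rescan. */
static struct tally run_style_a(const char *input)
{
    struct tally t = {0, 0};
    char token[2] = {0, 0};

    for (; *input; input++) {
        token[0] = *input;
        action_rescan(token, &t);
    }
    return t;
}

/* Drive style (B): the comparison below stands in for the scanner
 * having already selected the "\n" rule; the action inspects nothing. */
static struct tally run_style_b(const char *input)
{
    struct tally t = {0, 0};

    for (; *input; input++)
        if (*input == '\n')
            action_newline_rule(&t);
    return t;
}
```

On the input "ab\ncd\n" both styles report 2 lines, but style (A) looks at
all 6 bytes a second time inside the actions, while style (B) looks at none.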

> >However, this is of no interest to the list.
>
> As a development strategy for Flex, it may, in view of the heated and long
> discussions about lineno and locations.

I don't think we have any business discussing development strategy for flex.
At least not me. I am just discussing this from the user standpoint.

[..snip..]
> > My argument based on the KISS principle
>
> You have not yet explained what KISS is short for.

Basically I replied to this thread only for this point:

KISS is short for Keep It Simple Stupid. I am not calling any names, just
the principle. Time honored, proven right, time after time. Suddenly, I
realized that the new generation might be missing a few cornerstones of
programming (sigh!).

[...snip...]
> Clearly, if one is hooked onto a fixed, predetermined grammar that will
> not change, then all kinds of tricks can be made to optimize that.

Nope. That's not the case. I am talking about a general alternative to
yylineno. From binary, to decimal, to ascii and pseudo-ascii input.

> >Furthermore in the best case (no bugs - completely isolated code) there
> >will be a delay in the parser generation (flex step is quite fast though).
>
> The addition to the Flex compile time itself is clearly negligible.

Remains to be seen. Granted it can be small. Nothing is negligible when
developing a tool that's gonna be used a million times. It is going to pay
you back, eventually.

> > In
> >addition it will make for a larger flex source, more difficult to
> >maintain.
>
> Not much.
>
> >Akim seems to think that it might be unfeasible.
>
> I think that Akim thinks about things that are more extensive than ever
> proposed here.

Granted. However, I too, share the feeling that it is going to be more
difficult than what you think. Don't forget yylineno code is much simpler,
and it took almost 10 years(!) to discover a major bug in it (REJECT).

Hans, I admire your resilience and commitment to maintaining this thread
(and several others, too). However, I will have to be excused, time not
permitting. For the record, I've never argued to remove yylineno, nor
am I telling people not to use it. I am just worried, like a few others,
about the features to be implemented and their utility. yylineno just served
as an example in the discussion.

Nikos





