[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCHES] example calc++
Re: [PATCHES] example calc++
Tue, 27 Nov 2018 06:42:03 +0100
> Le 26 nov. 2018 à 15:31, Jannick <address@hidden> a écrit :
>> doc: calc++: ignore \r in the scaner
> (minor: a typo.)
Bummer. Too late, but thanks!
>> * doc/bison.texi (Calc++ Scanner): Ignore \r.
>> diff --git a/doc/bison.texi b/doc/bison.texi index 01d7c8dc..371cec84 100644
>> --- a/doc/bison.texi
>> +++ b/doc/bison.texi
>> @@ -11897,7 +11897,7 @@ Abbreviations allow for more readable rules.
>> id [a-zA-Z][a-zA-Z_0-9]*
>> int [0-9]+
>> -blank [ \t]
>> +blank [ \t\r]
>> @end example
> This code snip appears not only in the sample code files, but in bison's
> documentation, too. So, for didactic reasons, I would be hesitating to put
> '\r' into a regexp named 'blank', albeit ultimately no difference to the
> scanner, of course. The most natural way I wanted to handle this without
> changing to much would have been to define a regexp dealing with any kind of
> line break, however, that collides with the line counting
> [\n]+ loc.lines (yyleng);
> This was the reason I put '\r' into a separate line to simply ignore them.
> Thinking about that a bit more and generalizing this for probably almost all
> operating systems out there, I believe /(\r\n|\n\r|\n|\r)/ would be a good
> pattern for a line break to start off with and the line counter must be
> adjusted to. Suggestion (using flex'es greediness):
> ((\r\n)+|(\n\r)+) loc.lines (yyleng/2);
> (\n+|\r+) loc.lines (yyleng);
> This would work even for mixed EOL files (with both '\n' and '\r\n') on my
> Windows machine, where I have never seen a file with both '\n' and '\r' as
> single EOL character. Not tested.
You've just shown why I don't want to do it and why I treated \r
as a blank :) It's intended to be a simple example, and I don't
want to do that at all.
>> We don’t use a reentrant scanner, and we don’t call yylex_init, so I don’t
>> think we should call yylex_destroy.=
> This is a probably not so well documented issue for __non-reentrant__ flex
> scanners: flex allocates memory for the stack object 'yy_buffer_stack' every
> buffer struct is pushed onto. For reentrant scanners the stack is a member of
> the yyscanner object, for non-reentrant ones it is a global variable.
> Regardless of the scanner type 'yyensure_buffer_stack' (re)allocates memory
> for the stack object to make sure that it exists and the stack has sufficient
> size. This happens, e.g., when 'yylex' is called the first time. If, for a
> non-reentrant scanner, we do not call 'yylex_destroy' at the end of the
> program, there is memory leakage. Same applies to reentrant scanners which
> is more known.
> So, still my suggestion for the non-reentrant scanner calc++ to install a
> patch making sure that 'yylex_destroy' is called at the end.
Well, then Flex must first fix its documentation. The documentation
clearly states that this feature is for reentrant scanners, which
is not our case currently. I don't want to use an undocumented
feature, because I don't want to commit Bison to features for which
there is no clear contract with the developer, and conversely, I
don't want Flex maintainers to feel constrained to maintain a
feature that might have simply slipped out unknowingly.
That's a lesson I learned from Bison itself. Sometimes people start
using "features" that are mere implementation details that we feel
free to change. The contract is the documentation.
Also, please note that this is not a leak. Leaking is losing
access to allocated memory. Here, the memory is still perfectly
reachable, it's only not freed at the end. Rest assured
that the OS will get its memory back afterwards.