bison-patches

Re: document new features of parse.error


From: Adrian Vogelsgesang
Subject: Re: document new features of parse.error
Date: Mon, 27 Jan 2020 09:19:05 +0000
User-agent: Microsoft-MacOutlook/10.10.b.190609

Hi Akim,

do you think it makes sense to deprecate `verbose` already with the 
introduction of `detailed`?
As far as I see it, there is no good reason to use `verbose` over `detailed`, 
or am I missing something?

Also:
> +Control the generation syntax error messages. @xref{Error Reporting}.
Control the generation *of* syntax error messages.

Cheers,
Adrian

From: bison-patches <bison-patches-bounces+avogelsgesang=address@hidden> on 
behalf of Akim Demaille <address@hidden>
Date: Monday, 27 January 2020 at 06:54
To: Bison Patches <address@hidden>
Subject: RFC: doc: document new features of parse.error

If someone feels s/he's good at writing technical material, I would appreciate 
some help.

This is not enough, I will have to add details about alias internationalization 
for instance. And cover the other languages when we're done.

However, I think we should start using these features in real projects right now 
(maybe releasing an alpha would help) so that we can check that it does address 
the problems we found. Of course I don't expect these projects to merge the PR 
yet, but we really need to see how these changes behave in the wild.

Cheers!


commit b7955e2a0943e998de5d0736c85a80bd480ce58c
Author: Akim Demaille <address@hidden>
Date: Sat Jan 25 17:26:59 2020 +0100

doc: document new features of parse.error

* doc/bison.texi (Error Reporting): Rename as...
(Error Reporting Function): this.
Adjust dependencies.
Make it a subsection of this...
(Error Reporting): new section.
(Syntax Error Reporting Function): New.
(parse.error): Update description.

diff --git a/NEWS b/NEWS
index b764c81a..b4f38496 100644
--- a/NEWS
+++ b/NEWS
@@ -14,6 +14,67 @@ GNU Bison NEWS
(2013-07-25), "%error-verbose" is deprecated in favor of "%define
parse.error verbose".

+** New features
+
+*** Improved syntax error messages
+
+ Two new values for the %define parse.error variable offer more control to
+ the user.
+
+**** %define parse.error detailed
+
+ The behavior of "%define parse.error detailed" closely resembles that
+ of "%define parse.error verbose" with a few exceptions. First, it is safe
+ to use non-ASCII characters in token aliases (with 'verbose', the result
+ depends on the locale with which bison was run). Second, a yysymbol_name
+ function is exposed to the user, instead of the yytnamerr function and the
+ yytname table. Third, token internationalization is supported (see
+ below).
+
+**** %define parse.error custom
+
+ With this directive, the user forges and emits the syntax error message
+ herself by defining a function such as:
+
+ int
+ yyreport_syntax_error (const yyparse_context_t *ctx)
+ {
+   enum { ARGMAX = 10 };
+   int arg[ARGMAX];
+   int n = yysyntax_error_arguments (ctx, arg, ARGMAX);
+   if (n == -2)
+     return 2; // Memory exhausted.
+   YY_LOCATION_PRINT (stderr, *yyparse_context_location (ctx));
+   fprintf (stderr, ": syntax error");
+   for (int i = 1; i < n; ++i)
+     fprintf (stderr, " %s %s",
+              i == 1 ? "expected" : "or", yysymbol_name (arg[i]));
+   if (n)
+     fprintf (stderr, " before %s", yysymbol_name (arg[0]));
+   fprintf (stderr, "\n");
+   return 0;
+ }
+
+**** Token aliases internationalization
+
+ When the %define variable parse.error is set to `custom` or `detailed`,
+ one may use the _() annotation to specify which token aliases are to be
+ translated. For instance
+
+ %token
+     PLUS   "+"
+     MINUS  "-"
+     EOF 0  _("end of file")
+   <double>
+     NUM _("double precision number")
+   <symrec*>
+     FUN _("function")
+     VAR _("variable")
+
+ In that case the user must define _() and N_(), and yysymbol_name returns
+ the translated symbol (i.e., it returns '_("variable")' rather than
+ '"variable"').
+
* Noteworthy changes in release 3.5.1 (2020-01-19) [stable]

** Bug fixes
@@ -3881,7 +3942,9 @@ along with this program. If not, see <http://www.gnu.org/licenses/>.
LocalWords: Wdeprecated yytext Variadic variadic yyrhs yyphrs RCS README
LocalWords: noexcept constexpr ispell american deprecations backend Teoh
LocalWords: YYPRINT Mangold Bonzini's Wdangling exVal baz checkable gcc
- LocalWords: fsanitize Vogelsgesang lis redeclared stdint automata
+ LocalWords: fsanitize Vogelsgesang lis redeclared stdint automata yytname
+ LocalWords: yysymbol yytnamerr yyreport ctx ARGMAX yysyntax stderr
+ LocalWords: symrec

Local Variables:
ispell-dictionary: "american"
diff --git a/doc/bison.texi b/doc/bison.texi
index a3b947b0..844e21b5 100644
--- a/doc/bison.texi
+++ b/doc/bison.texi
@@ -305,7 +305,7 @@ Parser C-Language Interface
* Parser Delete Function:: How to call @code{yypstate_delete} and what it returns.
* Lexical:: You must supply a function @code{yylex}
which reads tokens.
-* Error Reporting:: You must supply a function @code{yyerror}.
+* Error Reporting:: Passing error messages to the user.
* Action Features:: Special features for use in actions.
* Internationalization:: How to let the parser speak in the user's
native language.
@@ -322,6 +322,11 @@ The Lexical Analyzer Function @code{yylex}
* Pure Calling:: How the calling convention differs in a pure parser
(@pxref{Pure Decl, ,A Pure (Reentrant) Parser}).

+Error Reporting
+
+* Error Reporting Function:: You must supply a function @code{yyerror}.
+* Syntax Error Reporting Function:: You can supply a function @code{yyreport_syntax_error}.
+
The Bison Parser Algorithm

* Lookahead:: Parser looks one token ahead when deciding what to do.
@@ -5437,13 +5442,13 @@ reentrant. It looks like this:

The result is that the communication variables @code{yylval} and
@code{yylloc} become local variables in @code{yyparse}, and a different
-calling convention is used for the lexical analyzer function
-@code{yylex}. @xref{Pure Calling, ,Calling Conventions for Pure
-Parsers}, for the details of this. The variable @code{yynerrs}
-becomes local in @code{yyparse} in pull mode but it becomes a member
-of @code{yypstate} in push mode. (@pxref{Error Reporting, ,The Error
-Reporting Function @code{yyerror}}). The convention for calling
-@code{yyparse} itself is unchanged.
+calling convention is used for the lexical analyzer function @code{yylex}.
+@xref{Pure Calling, ,Calling Conventions for Pure Parsers}, for the details
+of this. The variable @code{yynerrs} becomes local in @code{yyparse} in
+pull mode but it becomes a member of @code{yypstate} in push mode.
+(@pxref{Error Reporting Function, ,The Error Reporting Function
+@code{yyerror}}). The convention for calling @code{yyparse} itself is
+unchanged.

Whether the parser is pure has nothing to do with the grammar rules.
You can generate either a pure parser or a nonreentrant parser from any
@@ -6095,8 +6100,8 @@ used, then both parsers have the same signature:
void yyerror (YYLTYPE *llocp, int *nastiness, char const *msg);
@end example

-(@pxref{Error Reporting, ,The Error
-Reporting Function @code{yyerror}})
+(@pxref{Error Reporting Function, ,The Error Reporting Function
+@code{yyerror}})

@item Default Value: @code{false}

@@ -6509,22 +6514,41 @@ constructed and destroyed properly. This option checks these constraints.
@item Languages(s):
all
@item Purpose:
-Control the kind of error messages passed to the error reporting
-function. @xref{Error Reporting, ,The Error Reporting Function
-@code{yyerror}}.
+Control the generation syntax error messages. @xref{Error Reporting}.
@item Accepted Values:
@itemize
@item @code{simple}
Error messages passed to @code{yyerror} are simply @w{@code{"syntax
error"}}.
+
+@item @code{detailed}
+Error messages report the unexpected token, and possibly the expected ones.
+However, this report can often be incorrect when LAC is not enabled
+(@pxref{LAC}). Token name internationalization is supported.
+
@item @code{verbose}
+Similar (but inferior) to @code{detailed}.
+
Error messages report the unexpected token, and possibly the expected ones.
However, this report can often be incorrect when LAC is not enabled
(@pxref{LAC}).
+
+Does not support token internationalization. Using non-ASCII characters in
+token aliases is not portable.
+
+@item @code{custom}
+The user is in charge of generating the syntax error message by defining the
+@code{yyreport_syntax_error} function. @xref{Syntax Error Reporting
+Function, ,The Syntax Error Reporting Function
+@code{yyreport_syntax_error}}.
@end itemize

@item Default Value:
@code{simple}
+
+@item History:
+Introduced in 3.0 with support for @code{simple} and @code{verbose}. Values
+@code{custom} and @code{detailed} were introduced in 3.6.
@end itemize
@end deffn
@c parse.error
@@ -6826,7 +6850,7 @@ in the grammar file, you are likely to run into trouble.
* Parser Delete Function:: How to call @code{yypstate_delete} and what it returns.
* Lexical:: You must supply a function @code{yylex}
which reads tokens.
-* Error Reporting:: You must supply a function @code{yyerror}.
+* Error Reporting:: Passing error messages to the user.
* Action Features:: Special features for use in actions.
* Internationalization:: How to let the parser speak in the user's
native language.
@@ -7265,8 +7289,21 @@ int yylex (YYSTYPE *lvalp, YYLTYPE *llocp,
int yyparse (parser_mode *mode, environment_type *env);
@end example

+
@node Error Reporting
-@section The Error Reporting Function @code{yyerror}
+@section Error Reporting
+
+During its execution the parser may have error messages to pass to the user,
+such as syntax errors or memory exhaustion. How these messages are delivered
+to the user must be specified by the developer.
+
+@menu
+* Error Reporting Function:: You must supply a function @code{yyerror}.
+* Syntax Error Reporting Function:: You can supply a function @code{yyreport_syntax_error}.
+@end menu
+
+@node Error Reporting Function
+@subsection The Error Reporting Function @code{yyerror}
@cindex error reporting function
@findex yyerror
@cindex parse error
@@ -7284,7 +7321,7 @@ called by @code{yyparse} whenever a syntax error is found, and it
receives one argument. For a syntax error, the string is normally
@w{@code{"syntax error"}}.

-@findex %define parse.error
+@findex %define parse.error verbose
If you invoke @samp{%define parse.error verbose} in the Bison declarations
section (@pxref{Bison Declarations, ,The Bison Declarations Section}), then
Bison provides a more verbose and specific error message string instead of
@@ -7352,13 +7389,76 @@ reported so far. Normally this variable is global; but if you
request a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser})
then it is a local variable which only the actions can access.

+
+@node Syntax Error Reporting Function
+@subsection The Syntax Error Reporting Function @code{yyreport_syntax_error}
+
+@findex %define parse.error custom
+If you invoke @samp{%define parse.error custom} in the Bison declarations
+section (@pxref{Bison Declarations, ,The Bison Declarations Section}), then
+the parser no longer passes syntax error messages to @code{yyerror}, rather
+it leaves that task to the user by calling the @code{yyreport_syntax_error}
+function.
+
+@deftypefun int yyreport_syntax_error (@code{const yyparse_context_t *}@var{ctx})
+Report a syntax error to the user. Return 0 on success, 2 on memory exhaustion.
+@end deftypefun
+
+Use the following functions to build the error message.
+
+@deftypefun {YYLTYPE *} yyparse_context_location (@code{const yyparse_context_t *}@var{ctx})
+The location of the syntax error.
+@end deftypefun
+
+
+@deftypefun int yysyntax_error_arguments (@code{const yyparse_context_t *}@var{ctx}, @code{int} @var{argv}@code{[]}, @code{int} @var{argc})
+Fill @var{argv} with first the internal number of the token that caused the
+error, then the internal numbers of the expected tokens. Never put more
+than @var{argc} elements into @var{argv}, and on success return the
+effective number of numbers stored in @var{argv}, which can be 0.
+
+If @var{argv} is null, return the size needed to store all the possible
+values, which is always less than @code{YYNTOKENS}. When LAC is enabled,
+may return -2 on memory exhaustion.
+@end deftypefun
+
+@deftypefun {const char *} yysymbol_name (@code{int} @var{symbol})
+The name of the symbol whose internal number is @var{symbol}, possibly
+translated. Must be called with valid symbol numbers.
+@end deftypefun
+
+A custom syntax error function looks as follows.
+
+@example
+int
+yyreport_syntax_error (const yyparse_context_t *ctx)
+@{
+  enum @{ ARGMAX = 10 @};
+  int arg[ARGMAX];
+  int n = yysyntax_error_arguments (ctx, arg, ARGMAX);
+  if (n == -2)
+    return 2;
+  fprintf (stderr, "syntax error");
+  for (int i = 1; i < n; ++i)
+    fprintf (stderr, " %s %s",
+             i == 1 ? "expected" : "or", yysymbol_name (arg[i]));
+  if (n)
+    fprintf (stderr, " before %s", yysymbol_name (arg[0]));
+  fprintf (stderr, "\n");
+  return 0;
+@}
+@end example
+
+You still must provide a @code{yyerror} function, used for instance to
+report memory exhaustion.
+
@node Action Features
@section Special Features for Use in Actions
@cindex summary, action features
@cindex action features summary

-Here is a table of Bison constructs, variables and macros that
-are useful in actions.
+Here is a table of Bison constructs, variables and macros that are useful in
+actions.

@deffn {Variable} $$
Acts like a variable that contains the semantic value for the
@@ -13929,8 +14029,7 @@ token is reset to the token that originally caused the violation.
@end deffn

@deffn {Directive} %error-verbose
-An obsolete directive standing for @samp{%define parse.error verbose}
-(@pxref{Error Reporting, ,The Error Reporting Function @code{yyerror}}).
+An obsolete directive standing for @samp{%define parse.error verbose}.
@end deffn

@deffn {Directive} %file-prefix "@var{prefix}"
@@ -14155,7 +14254,7 @@ instead.

@deffn {Function} yyerror
User-supplied function to be called by @code{yyparse} on error.
-@xref{Error Reporting, ,The Error Reporting Function @code{yyerror}}.
+@xref{Error Reporting Function, ,The Error Reporting Function @code{yyerror}}.
@end deffn

@deffn {Macro} YYFPRINTF
@@ -14210,7 +14309,7 @@ Management}.
Global variable which Bison increments each time it reports a syntax error.
(In a pure parser, it is a local variable within @code{yyparse}. In a
pure push parser, it is a member of @code{yypstate}.)
-@xref{Error Reporting, ,The Error Reporting Function @code{yyerror}}.
+@xref{Error Reporting Function, ,The Error Reporting Function @code{yyerror}}.
@end deffn

@deffn {Function} yyparse
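
For readers who want to try the message-building loop outside a generated
parser, here is a minimal C sketch. It is not part of the patch.
`mock_symbol_name` and `format_syntax_error` are hypothetical stand-ins: the
real `yysymbol_name` and `yysyntax_error_arguments` exist only inside a parser
generated with `%define parse.error custom`, and the symbol numbers below are
made up. The sketch assumes only the argument layout described in the patch:
`arg[0]` is the unexpected token and `arg[1]` through `arg[n-1]` are the
expected ones.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the generated yysymbol_name.  The symbol
   numbers 0..3 are invented for this sketch.  */
static const char *
mock_symbol_name (int symbol)
{
  static const char *const names[] = { "end of file", "+", "-", "number" };
  return names[symbol];
}

/* Append S to the NUL-terminated string in BUF (capacity SIZE),
   truncating silently when the buffer is full.  */
static void
append (char *buf, size_t size, const char *s)
{
  size_t len = strlen (buf);
  if (len + 1 < size)
    snprintf (buf + len, size - len, "%s", s);
}

/* Build the message into BUF instead of printing it to stderr.
   ARG follows the layout assumed above: arg[0] is the unexpected
   token, arg[1..n-1] are the expected tokens, and N may be 0.  */
static void
format_syntax_error (char *buf, size_t size, const int *arg, int n)
{
  assert (size > 0);
  snprintf (buf, size, "syntax error");
  for (int i = 1; i < n; ++i)
    {
      append (buf, size, i == 1 ? " expected " : " or ");
      append (buf, size, mock_symbol_name (arg[i]));
    }
  if (n > 0)
    {
      append (buf, size, " before ");
      append (buf, size, mock_symbol_name (arg[0]));
    }
}
```

Calling `format_syntax_error (buf, sizeof buf, arg, 3)` with
`arg = { 1, 3, 0 }` mirrors the loop in the patch's sample
`yyreport_syntax_error`. Writing into a buffer rather than to stderr is a
choice made here only so the result can be checked; in a real parser the
numbers would come from `yysyntax_error_arguments`.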


