doc: updates for 3.6

bison-patches

From:	Akim Demaille
Subject:	doc: updates for 3.6
Date:	Thu, 16 Apr 2020 08:45:28 +0200
   commit 5d983253f7106fe835953c7bbda15ca90247577b
   Author: Akim Demaille <address@hidden>
   Date:   Mon Apr 13 19:06:06 2020 +0200
      doc: updates for 3.6
      * doc/bison.texi: More s/token type/token kind/.
      * NEWS: Update.
   diff --git a/NEWS b/NEWS
   index b4e279cc..f794bfe6 100644
   --- a/NEWS
   +++ b/NEWS
   @@ -19,7 +19,7 @@ GNU Bison NEWS
   *** Improved syntax error messages
     Two new values for the %define parse.error variable offer more control to
   -  the user.
   +  the user.  Available in all the skeletons (C, C++, Java).
   **** %define parse.error detailed
   @@ -34,7 +34,12 @@ GNU Bison NEWS
   **** %define parse.error custom
     With this directive, the user forges and emits the syntax error message
   -  herself by defining a function such as:
   +  herself by defining the yyreport_syntax_error function.  A new type,
   +  yypcontext_t, captures the circumstances of the error, and provides the
   +  user with functions to get details, such as yypcontext_expected_tokens to
   +  get the list of expected token kinds.
   +
   +  A possible implementation of yyreport_syntax_error is:
       int
       yyreport_syntax_error (const yypcontext_t *ctx)
   @@ -86,35 +91,42 @@ GNU Bison NEWS
   *** List of expected tokens (yacc.c)
   -  At any point during parsing (including even before submitting the first
   -  token), push parsers may now invoke yypstate_expected_tokens to get the
   -  list of possible tokens.  This feature can be used to propose
   -  autocompletion (see below the "bistromathic" example).
   +  Push parsers may invoke yypstate_expected_tokens at any point during
   +  parsing (including even before submitting the first token) to get the list
   +  of possible tokens.  This feature can be used to propose autocompletion
   +  (see below the "bistromathic" example).
     It makes little sense to use this feature without enabling LAC (lookahead
     correction).
   *** Deep overhaul of the symbol and token kinds
   -  To avoid the confusion with typing in programming languages, we now refer
   -  to token and symbol "kinds" instead of token and symbol "types".
   +  To avoid the confusion with types in programming languages, we now refer
   +  to token and symbol "kinds" instead of token and symbol "types".  The
   +  documentation and error messages have been revised.
   +
   +  All the skeletons have been updated to use dedicated enum types rather
   +  than integral types.  Special symbols are now regular citizens, instead of
   +  being declared in ad hoc ways.
   **** Token kinds
     The "token kind" is what is returned by the scanner, e.g., PLUS, NUMBER,
   -  LPAREN, etc.  Users are invited to replace their uses of "enum
   -  yytokentype" by "yytoken_kind_t".
   +  LPAREN, etc.  While backward compatibility is of course ensured, users are
   +  nonetheless invited to replace their uses of "enum yytokentype" by
   +  "yytoken_kind_t".
     This type now also includes tokens that were previously hidden: YYEOF (end
     of input), YYUNDEF (undefined token), and YYERRCODE (error token).  They
   -  now have string aliases, internationalized if internationalization is
   +  now have string aliases, internationalized when internationalization is
     enabled.  Therefore, by default, error messages now refer to "end of file"
   -  (internationalized) rather than the cryptic "$end".
   +  (internationalized) rather than the cryptic "$end", or to "invalid token"
   +  rather than "$undefined".
   -  In most case, it is now useless to define the end-of-line token as
   -  follows:
   +  Therefore in most cases it is now useless to define the end-of-line token
   +  as follows:
   -    %token EOF 0  _("end of file")
   +    %token T_EOF 0 "end of file"
     Rather simply use "YYEOF" in your scanner.
   @@ -126,7 +138,9 @@ GNU Bison NEWS
     They are now exposed as a enum, "yysymbol_kind_t".
   -  This allows users to tailor the error messages the way they want.
   +  This allows users to tailor the error messages the way they want, or to
   +  process some symbols in a specific way in autocompletion (see the
   +  bistromathic example below).
   *** Modernize display of explanatory statements in diagnostics
   @@ -166,12 +180,18 @@ GNU Bison NEWS
     The lexcalc example (a simple example in C based on Flex and Bison) now
     also demonstrates location tracking.
   +
     A new C example, bistromathic, is a fully featured interactive calculator
     using many Bison features: pure interface, push parser, autocompletion
     based on the current parser state (using yypstate_expected_tokens),
     location tracking, internationalized custom error messages, lookahead
     correction, rich debug traces, etc.
   +  It shows how to depend on the symbol kinds to tailor autocompletion.  For
   +  instance it recognizes the symbol kind "VARIABLE" to propose
   +  autocompletion on the existing variables, rather than of the word
   +  "variable".
   +
   * Noteworthy changes in release 3.5.4 (2020-04-05) [stable]
   ** WARNING: Future backward-incompatibilities!
   diff --git a/TODO b/TODO
   index 9555a621..80f01f10 100644
   --- a/TODO
   +++ b/TODO
   @@ -19,12 +19,11 @@
   - symbol.type_get should be kind_get, and it's not documented.
   - YYERRCODE and "end of file" and translation
   -*** The documentation
   -You can explicitly specify the numeric code for a token type...
   +** Java
   +*** Examples
   +Have an example with a push parser.  Use autocompletion in that case.
   -The token numbered as 0.
   -
   -** Java: calc.at
   +*** calc.at
   Stop hard-coding "Calc".  Adjust local.at (look for FIXME).
   ** doc
   diff --git a/doc/bison.texi b/doc/bison.texi
   index fadd5648..467799b4 100644
   --- a/doc/bison.texi
   +++ b/doc/bison.texi
   @@ -1232,7 +1232,7 @@ action in a GLR parser.
   @cindex GLR parsers and @code{yylval}
   @vindex yylloc
   @cindex GLR parsers and @code{yylloc}
   -In any semantic action, you can examine @code{yychar} to determine the type
   +In any semantic action, you can examine @code{yychar} to determine the kind
   of the lookahead token present at the time of the associated reduction.
   After checking that @code{yychar} is not set to @code{YYEMPTY} or
   @code{YYEOF}, you can then examine @code{yylval} and @code{yylloc} to
   @@ -1853,7 +1853,7 @@ for such a single-character token is the character itself.
   The return value of the lexical analyzer function is a numeric code which
   represents a token kind.  The same text used in Bison rules to stand for
   -this token kind is also a C expression for the numeric code for the type.
   +this token kind is also a C expression for the numeric code of the kind.
   This works in two ways.  If the token kind is a character literal, then its
   numeric code is that of the character; you can use the same character
   literal in the lexical analyzer to express the number.  If the token kind is
   @@ -2230,14 +2230,13 @@ the same as the declarations for the infix notation calculator.
   @end example
   @noindent
   -Note there are no declarations specific to locations.  Defining a data
   -type for storing locations is not needed: we will use the type provided
   -by default (@pxref{Location Type}), which is a
   -four member structure with the following integer fields:
   -@code{first_line}, @code{first_column}, @code{last_line} and
   -@code{last_column}.  By conventions, and in accordance with the GNU
   -Coding Standards and common practice, the line and column count both
   -start at 1.
   +Note there are no declarations specific to locations.  Defining a data type
   +for storing locations is not needed: we will use the type provided by
   +default (@pxref{Location Type}), which is a four member structure with the
   +following integer fields: @code{first_line}, @code{first_column},
   +@code{last_line} and @code{last_column}.  By conventions, and in accordance
   +with the GNU Coding Standards and common practice, the line and column count
   +both start at 1.
   @node Ltcalc Rules
   @subsection Grammar Rules for @code{ltcalc}
   @@ -2646,7 +2645,7 @@ By simply editing the initialization list and adding the necessary include
   files, you can add additional functions to the calculator.
   Two important functions allow look-up and installation of symbols in the
   -symbol table.  The function @code{putsym} is passed a name and the type
   +symbol table.  The function @code{putsym} is passed a name and the kind
   (@code{VAR} or @code{FUN}) of the object to be installed.  The object is
   linked to the front of the list, and a pointer to the object is returned.
   The function @code{getsym} is passed the name of the symbol to look up.  If
   @@ -3698,10 +3697,9 @@ In a simple program it may be sufficient to use the same data type for
   the semantic values of all language constructs.  This was true in the
   RPN and infix calculator examples (@pxref{RPN Calc}).
   -Bison normally uses the type @code{int} for semantic values if your
   -program uses the same data type for all language constructs.  To
   -specify some other type, define the @code{%define} variable
   -@code{api.value.type} like this:
   +Bison normally uses the type @code{int} for semantic values if your program
   +uses the same data type for all language constructs.  To specify some other
   +type, define the @code{%define} variable @code{api.value.type} like this:
   @example
   %define api.value.type @{double@}
   @@ -4492,10 +4490,9 @@ Defining a data type for locations is much simpler than for semantic values,
   since all tokens and groupings always use the same type.
   You can specify the type of locations by defining a macro called
   -@code{YYLTYPE}, just as you can specify the semantic value type by
   -defining a @code{YYSTYPE} macro (@pxref{Value Type}).
   -When @code{YYLTYPE} is not defined, Bison uses a default structure type with
   -four members:
   +@code{YYLTYPE}, just as you can specify the semantic value type by defining
   +a @code{YYSTYPE} macro (@pxref{Value Type}).  When @code{YYLTYPE} is not
   +defined, Bison uses a default structure type with four members:
   @example
   typedef struct YYLTYPE
   @@ -7161,7 +7158,7 @@ yylex (void)
       return c;      /* Assume token kind for '+' is '+'. */
     @dots{}
     else
   -    return INT;    /* Return the type of the token. */
   +    return INT;    /* Return the kind of the token. */
     @dots{}
   @}
   @end example
   @@ -7211,7 +7208,7 @@ the type is @code{int} (the default), you might write this in @code{yylex}:
   @group
     @dots{}
     yylval = value;  /* Put value onto Bison stack. */
   -  return INT;      /* Return the type of the token. */
   +  return INT;      /* Return the kind of the token. */
     @dots{}
   @end group
   @end example
   @@ -7238,7 +7235,7 @@ then the code in @code{yylex} might look like this:
   @group
     @dots{}
     yylval.intval = value; /* Put value onto Bison stack. */
   -  return INT;            /* Return the type of the token. */
   +  return INT;            /* Return the kind of the token. */
     @dots{}
   @end group
   @end example
   @@ -7279,7 +7276,7 @@ yylex (YYSTYPE *lvalp, YYLTYPE *llocp)
   @{
     @dots{}
     *lvalp = value;  /* Put value onto Bison stack. */
   -  return INT;      /* Return the type of the token. */
   +  return INT;      /* Return the kind of the token. */
     @dots{}
   @}
   @end example
   @@ -8383,15 +8380,14 @@ represent the entire sequence of terminal and nonterminal symbols at or
   near the top of the stack.  The current state collects all the information
   about previous input which is relevant to deciding what to do next.
   -Each time a lookahead token is read, the current parser state together
   -with the type of lookahead token are looked up in a table.  This table
   -entry can say, ``Shift the lookahead token.''  In this case, it also
   -specifies the new parser state, which is pushed onto the top of the
   -parser stack.  Or it can say, ``Reduce using rule number @var{n}.''
   -This means that a certain number of tokens or groupings are taken off
   -the top of the stack, and replaced by one grouping.  In other words,
   -that number of states are popped from the stack, and one new state is
   -pushed.
   +Each time a lookahead token is read, the current parser state together with
   +the kind of lookahead token are looked up in a table.  This table entry can
   +say, ``Shift the lookahead token.''  In this case, it also specifies the new
   +parser state, which is pushed onto the top of the parser stack.  Or it can
   +say, ``Reduce using rule number @var{n}.''  This means that a certain number
   +of tokens or groupings are taken off the top of the stack, and replaced by
   +one grouping.  In other words, that number of states are popped from the
   +stack, and one new state is pushed.
   There is one other alternative: the table can say that the lookahead token
   is erroneous in the current state.  This causes error processing to begin
   @@ -11624,8 +11620,8 @@ particular it produces a genuine @code{union}, which have a few specific
   features in C++.
   @itemize @minus
   @item
   -The type @code{YYSTYPE} is defined but its use is discouraged: rather
   -you should refer to the parser's encapsulated type
   +The type @code{YYSTYPE} is defined but its use is discouraged: rather you
   +should refer to the parser's encapsulated type
   @code{yy::parser::semantic_type}.
   @item
   Non POD (Plain Old Data) types cannot be used.  C++98 forbids any instance
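
To illustrate the "Token kinds" NEWS entry in this patch, here is a minimal C sketch of a scanner that returns yytoken_kind_t values and uses YYEOF and YYUNDEF directly, rather than declaring its own end-of-file token.  The enum below is a hand-written stand-in for the one Bison generates in the parser header.  Its numeric values other than YYEOF = 0, the NUMBER and PLUS tokens, and the scan function with its string-cursor interface are assumptions made for illustration and are not part of the patch.

```c
#include <assert.h>
#include <ctype.h>

/* Stand-in for the generated enum.  Only YYEOF = 0 is taken from the
   NEWS text ("%token T_EOF 0"); the other values are illustrative.  */
typedef enum yytoken_kind_t
{
  YYEOF = 0,     /* "end of file" */
  YYUNDEF = 257, /* "invalid token" (value assumed) */
  NUMBER = 258,  /* hypothetical user token */
  PLUS = 259     /* hypothetical user token */
} yytoken_kind_t;

/* Hypothetical scanner: reads one token from *cursor, advances the
   cursor, and stores the value of a NUMBER in *value.  A real yylex
   has a different signature.  */
static yytoken_kind_t
scan (const char **cursor, int *value)
{
  assert (cursor && *cursor && value);
  const char *p = *cursor;

  /* Skip blanks.  */
  while (*p == ' ' || *p == '\t')
    ++p;

  /* End of input: return YYEOF and stay on the terminator.  */
  if (*p == '\0')
    {
      *cursor = p;
      return YYEOF;
    }

  /* Unsigned decimal integer.  */
  if (isdigit ((unsigned char) *p))
    {
      int v = 0;
      while (isdigit ((unsigned char) *p))
        v = v * 10 + (*p++ - '0');
      *value = v;
      *cursor = p;
      return NUMBER;
    }

  if (*p == '+')
    {
      *cursor = p + 1;
      return PLUS;
    }

  /* Anything else is reported as the undefined token.  */
  *cursor = p + 1;
  return YYUNDEF;
}
```

With a generated parser, the enum would come from the parser header and the error messages for YYEOF and YYUNDEF would use the string aliases described in the NEWS entry.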