RIP: c++: merge symbol_type and token


From: Akim Demaille
Subject: RIP: c++: merge symbol_type and token
Date: Sun, 23 Dec 2018 11:17:34 +0100

As I mentioned earlier, I would very much like to merge parser::token
and parser::symbol_type.  The implementation works well, and I was very
happy about that.

But it's not so simple.  The problem is that parser::token then looks
like this:


    /// "External" symbols: returned by the scanner.
    struct token : basic_symbol<by_type>
    {
      /// Superclass.
      typedef basic_symbol<by_type> super_type;

      /// Empty symbol.
      token () {};

      /// Constructor for valueless symbols, and symbols from each type.
      token (int tok);
      token (int tok, int v);

      enum token_type
      {
        foo = 258,
        bar = 259
      };
    };

and then I can have name clashes if the user defines a token named
"token", or one named like any of the members inherited from
basic_symbol.

That's annoying.

So I don't think I can follow this track.
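
To make the clash concrete, here is a minimal hand-written sketch; it is
not generated code.  The stand-ins for by_type and basic_symbol are
deliberately simplified: their members `type` and `value` are assumptions
chosen for illustration, and the enumerator `value` plays the role of
what a user token named "value" would presumably produce.  The enumerator
silently hides the inherited data member.

```cpp
#include <cassert>

// Simplified stand-ins for the generated classes (assumed shapes).
struct by_type
{
  by_type () : type (0) {}
  int type;   // hypothetical: the symbol kind
};

template <typename Base>
struct basic_symbol : Base
{
  basic_symbol () : value (0) {}
  int value;  // hypothetical: the semantic value
};

struct token : basic_symbol<by_type>
{
  typedef basic_symbol<by_type> super_type;

  token () {}
  explicit token (int tok) { type = tok; }

  enum token_type
  {
    foo = 258,
    value = 259   // a user token named "value": hides super_type::value
  };
};

// What `t.value` names once the enumerator exists: the enumerator.
inline int dot_value (const token& t) { return t.value; }

// The inherited member is still there, but only via qualification.
inline int inherited_value (const token& t) { return t.super_type::value; }

inline void demo_name_clash ()
{
  token t (token::foo);
  t.super_type::value = 42;
  assert (dot_value (t) == 259);       // not the semantic value
  assert (inherited_value (t) == 42);  // the semantic value
}
```

A token named "token" would be worse: an enumerator of a member enum may
not have the same name as its enclosing class, so the struct would be
rejected outright.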

For the record, the implementation patch below is followed by a patch
for the test suite, which gives an idea of what usage would have looked
like.




commit 1325d23f493c9c62ec7cb89444186a72ea76295c
Author: Akim Demaille <address@hidden>
Date:   Sat Dec 22 17:48:36 2018 +0100

    c++: merge symbol_type and token
    
    So far, we expose two related, but different, types: parser::token and
    parser::symbol_type.
    
    The enum that defines the token types is parser::token::yytokentype;
    parser::token serves only to keep the enumerators from "leaking" into
    parser (when this was introduced, there were no enum classes in C++).
    
    The external symbols (returned by yylex) are instances of
    parser::symbol_type.
    
    Now that symbol_type has clean and documented constructors, it feels
    weird that parser::symbol_type and parser::token are two different
    classes.  This commit fuses them.
    
    There is one technical problem: symbol_type would then define
    yytokentype, yet its superclass (basic_symbol<by_type>) needs
    yytokentype (in by_type).  So actually define yytokentype in by_type
    and expose it in token/symbol_type.
    
    * data/c++.m4 (b4_token_enums): Define `token_type`, since that's the
    exposed name.
    To ensure backward compatibility, we will expose `yytokentype` as an
    alias.
    (b4_public_types_declare): No longer define `yytokentype`.
    (by_type): Define `token_type`.
    (symbol_type): Rename as...
    (token): this.
    (symbol_type): New alias, for backward compatibility.
    * data/lalr1.cc: Use `token` instead of `symbol_type`.
    * data/variant.hh: Likewise.
    
    * tests/c++.at: Make sure yy::parser::token::yytokentype is still
    visible.
    
    * data/glr.cc: Since b4_public_types_declare no longer defines token,
    do it by hand.
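
The shape that results from moving the enum into by_type can be sketched
in plain C++ as follows.  This is a hand-written approximation, not the
skeleton's output: the `type` member and the constructors are simplified
assumptions, and the other members of by_type and basic_symbol are left
out for brevity.

```cpp
#include <cassert>
#include <type_traits>

struct parser
{
  // The enum lives in by_type, so the superclass of token can use it.
  struct by_type
  {
    enum token_type { foo = 258, bar = 259 };

    by_type () : type (0) {}
    explicit by_type (token_type t) : type (t) {}
    int type;   // simplified: stores the external token number
  };

  template <typename Base>
  struct basic_symbol : Base
  {
    basic_symbol () {}
    explicit basic_symbol (typename Base::token_type t) : Base (t) {}
  };

  // "External" symbols: token_type is inherited from by_type.
  struct token : basic_symbol<by_type>
  {
    typedef basic_symbol<by_type> super_type;

    /// Backward compatibility.
    typedef token_type yytokentype;

    token () {}
    explicit token (token_type t) : super_type (t) {}
  };

  /// (External) token type, as returned by yylex.
  typedef token::token_type token_type;

  /// Backward compatible alias.
  typedef token symbol_type;
};

// All the spellings name the same enum.
static_assert (std::is_same<parser::token_type,
                            parser::token::yytokentype>::value,
               "yytokentype must remain an alias of token_type");
static_assert (std::is_same<parser::token_type,
                            parser::by_type::token_type>::value,
               "token_type is the enum defined in by_type");

inline void demo_hoisted_enum ()
{
  parser::symbol_type s (parser::token::foo);   // old class name
  parser::token::yytokentype k = parser::token::bar;  // old enum name
  assert (s.type == 258);
  assert (k == 259);
}
```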

diff --git a/data/c++.m4 b/data/c++.m4
index 6a0604c0..aae28c88 100644
--- a/data/c++.m4
+++ b/data/c++.m4
@@ -162,7 +162,7 @@ m4_bpatsubst(m4_dquote(m4_bpatsubst(m4_dquote(b4_namespace_ref[ ]),
 # --------------
 # Output the definition of the tokens as enums.
 m4_define([b4_token_enums],
-[[enum yytokentype
+[[enum token_type
       {
         ]m4_join([,
         ],
@@ -218,15 +218,6 @@ m4_define([b4_public_types_declare],
       location_type location;])[
     };
 
-    /// Tokens.
-    struct token
-    {
-      ]b4_token_enums[
-    };
-
-    /// (External) token type, as returned by yylex.
-    typedef token::yytokentype token_type;
-
     /// Symbol type: an internal symbol number.
     typedef int symbol_number_type;
 
@@ -300,6 +291,8 @@ m4_define([b4_symbol_type_declare],
     /// Type access provider for token (enum) based symbols.
     struct by_type
     {
+      ]b4_token_enums[
+
       /// Default constructor.
       by_type ();
 
@@ -322,7 +315,7 @@ m4_define([b4_symbol_type_declare],
       /// \a empty when empty.
       symbol_number_type type_get () const YY_NOEXCEPT;
 
-      /// The token.
+      /// The external token number.
       token_type token () const YY_NOEXCEPT;
 
       /// The symbol type.
@@ -332,17 +325,26 @@ m4_define([b4_symbol_type_declare],
     };
 
     /// "External" symbols: returned by the scanner.
-    struct symbol_type : basic_symbol<by_type>
+    struct token : basic_symbol<by_type>
     {]b4_variant_if([[
       /// Superclass.
       typedef basic_symbol<by_type> super_type;
 
+      /// Backward compatibility.
+      typedef token_type yytokentype;
+
       /// Empty symbol.
-      symbol_type () {};
+      token () {};
 
       /// Constructor for valueless symbols, and symbols from each type.
 ]b4_type_foreach([_b4_token_constructor_declare])dnl
     ])[};
+
+    /// (External) token type, as returned by yylex.
+    typedef token::token_type token_type;
+
+    /// Backward compatible alias.
+    typedef token symbol_type;
 ]])
 
 
diff --git a/data/glr.cc b/data/glr.cc
index 0401b849..d183f003 100644
--- a/data/glr.cc
+++ b/data/glr.cc
@@ -267,6 +267,15 @@ b4_percent_code_get([[requires]])[
   class ]b4_parser_class_name[
   {
   public:
+    /// Tokens.
+    struct token
+    {
+      ]b4_token_enums[
+
+      /// Backward compatibility.
+      typedef token_type yytokentype;
+    };
+
 ]b4_public_types_declare[
 
     /// Build a parser object.
diff --git a/data/lalr1.cc b/data/lalr1.cc
index 21ec144f..36f01887 100644
--- a/data/lalr1.cc
+++ b/data/lalr1.cc
@@ -127,7 +127,7 @@ b4_dollar_popdef[]dnl
 m4_define([b4_lex],
 [b4_token_ctor_if(
 [b4_function_call([yylex],
-                  [symbol_type], m4_ifdef([b4_lex_param], b4_lex_param))],
+                  [token], m4_ifdef([b4_lex_param], b4_lex_param))],
 [b4_function_call([yylex], [int],
                   [b4_api_PREFIX[STYPE*], [&yyla.value]][]dnl
 b4_locations_if([, [[location*], [&yyla.location]]])dnl
@@ -230,7 +230,7 @@ m4_define([b4_shared_declarations],
     /// \param yystate   the state where the error occurred.
     /// \param yyla      the lookahead token.
     virtual std::string yysyntax_error_ (state_type yystate,
-                                         const symbol_type& yyla) const;
+                                         const token& yyla) const;
 
     /// Compute post-reduction state.
     /// \param yystate   the current state
@@ -331,7 +331,7 @@ m4_define([b4_shared_declarations],
       /// Move or copy construction.
       stack_symbol_type (YY_RVREF (stack_symbol_type) that);
       /// Steal the contents from \a sym to build this.
-      stack_symbol_type (state_type s, YY_MOVE_REF (symbol_type) sym);
+      stack_symbol_type (state_type s, YY_MOVE_REF (token) sym);
 #if YY_CPLUSPLUS < 201103L
       /// Assignment, needed by push_back by some old implementations.
       /// Moves the contents of that.
@@ -358,7 +358,7 @@ m4_define([b4_shared_declarations],
     /// \param s    the state
     /// \param sym  the symbol (for its value and location).
     /// \warning the contents of \a sym.value is stolen.
-    void yypush_ (const char* m, state_type s, YY_MOVE_REF (symbol_type) sym);
+    void yypush_ (const char* m, state_type s, YY_MOVE_REF (token) sym);
 
     /// Pop \a n symbols from the stack.
     void yypop_ (int n = 1);
@@ -614,7 +614,7 @@ m4_if(b4_prefix, [yy], [],
 #endif
   }
 
-  ]b4_parser_class_name[::stack_symbol_type::stack_symbol_type (state_type s, YY_MOVE_REF (symbol_type) that)
+  ]b4_parser_class_name[::stack_symbol_type::stack_symbol_type (state_type s, YY_MOVE_REF (token) that)
     : super_type (s]b4_variant_if([], [, YY_MOVE (that.value)])[]b4_locations_if([, YY_MOVE (that.location)])[)
   {]b4_variant_if([
     b4_symbol_variant([that.type_get ()],
@@ -679,12 +679,12 @@ m4_if(b4_prefix, [yy], [],
   }
 
   void
-  ]b4_parser_class_name[::yypush_ (const char* m, state_type s, YY_MOVE_REF (symbol_type) sym)
+  ]b4_parser_class_name[::yypush_ (const char* m, state_type s, YY_MOVE_REF (token) tok)
   {
 #if 201103L <= YY_CPLUSPLUS
-    yypush_ (m, stack_symbol_type (s, std::move (sym)));
+    yypush_ (m, stack_symbol_type (s, std::move (tok)));
 #else
-    stack_symbol_type ss (s, sym);
+    stack_symbol_type ss (s, tok);
     yypush_ (m, ss);
 #endif
   }
@@ -763,7 +763,7 @@ m4_if(b4_prefix, [yy], [],
     int yyerrstatus_ = 0;
 
     /// The lookahead symbol.
-    symbol_type yyla;]b4_locations_if([[
+    token yyla;]b4_locations_if([[
 
     /// The locations where the error started and ended.
     stack_symbol_type yyerror_range[3];]])[
@@ -819,7 +819,7 @@ b4_dollar_popdef])[]dnl
         try
 #endif // YY_EXCEPTIONS
           {]b4_token_ctor_if([[
-            symbol_type yylookahead (]b4_lex[);
+            token yylookahead (]b4_lex[);
             yyla.move (yylookahead);]], [[
             yyla.type = yytranslate_ (]b4_lex[);]])[
           }
@@ -1083,8 +1083,8 @@ b4_dollar_popdef])[]dnl
   // Generate an error message.
   std::string
   ]b4_parser_class_name[::yysyntax_error_ (]dnl
-b4_error_verbose_if([state_type yystate, const symbol_type& yyla],
-                    [state_type, const symbol_type&])[) const
+b4_error_verbose_if([state_type yystate, const token& yyla],
+                    [state_type, const token&])[) const
   {]b4_error_verbose_if([[
     // Number of reported tokens (one for the "unexpected", one per
     // "expected").
diff --git a/data/variant.hh b/data/variant.hh
index 545060e3..07f59315 100644
--- a/data/variant.hh
+++ b/data/variant.hh
@@ -352,14 +352,14 @@ m4_define([_b4_token_maker_declare],
 [b4_token_visible_if([$1],
 [#if 201103L <= YY_CPLUSPLUS
     static
-    symbol_type
+    token
     make_[]_b4_symbol([$1], [id]) (b4_join(
                b4_symbol_if([$1], [has_type],
                [b4_symbol([$1], [type]) v]),
                b4_locations_if([location_type l])));
 #else
     static
-    symbol_type
+    token
     make_[]_b4_symbol([$1], [id]) (b4_join(
                b4_symbol_if([$1], [has_type],
                [const b4_symbol([$1], [type])& v]),
@@ -375,13 +375,13 @@ m4_define([_b4_token_maker_declare],
 m4_define([_b4_token_constructor_declare],
 [m4_ifval(_b4_includes_tokens($@),
 [#if 201103L <= YY_CPLUSPLUS
-    symbol_type (b4_join(
+      token (b4_join(
         [int tok],
         b4_symbol_if([$1], [has_type],
                      [b4_symbol([$1], [type]) v]),
         b4_locations_if([location_type l])));
 #else
-    symbol_type (b4_join(
+      token (b4_join(
         [int tok],
         b4_symbol_if([$1], [has_type],
                      [const b4_symbol([$1], [type])& v]),
@@ -406,27 +406,27 @@ m4_define([_b4_token_maker_define],
 [b4_token_visible_if([$1],
 [#if 201103L <= YY_CPLUSPLUS
   inline
-  b4_parser_class_name::symbol_type
+  b4_parser_class_name::token
   b4_parser_class_name::make_[]_b4_symbol([$1], [id]) (b4_join(
                      b4_symbol_if([$1], [has_type],
                      [b4_symbol([$1], [type]) v]),
                      b4_locations_if([location_type l])))
   {
-    return symbol_type (b4_join([token::b4_symbol([$1], [id])],
-                                b4_symbol_if([$1], [has_type], [std::move (v)]),
-                                b4_locations_if([std::move (l)])));
+    return {b4_join([token::b4_symbol([$1], [id])],
+                    b4_symbol_if([$1], [has_type], [std::move (v)]),
+                    b4_locations_if([std::move (l)]))};
   }
 #else
   inline
-  b4_parser_class_name::symbol_type
+  b4_parser_class_name::token
   b4_parser_class_name::make_[]_b4_symbol([$1], [id]) (b4_join(
                      b4_symbol_if([$1], [has_type],
                      [const b4_symbol([$1], [type])& v]),
                      b4_locations_if([const location_type& l])))
   {
-    return symbol_type (b4_join([token::b4_symbol([$1], [id])],
-                                b4_symbol_if([$1], [has_type], [v]),
-                                b4_locations_if([l])));
+    return token (b4_join([token::b4_symbol([$1], [id])],
+                          b4_symbol_if([$1], [has_type], [v]),
+                          b4_locations_if([l])));
   }
 #endif
 ])])
@@ -446,7 +446,7 @@ m4_define([_b4_token_constructor_define],
 [m4_ifval(_b4_includes_tokens($@),
 [[#if 201103L <= YY_CPLUSPLUS
   inline
-  ]b4_parser_class_name[::symbol_type::symbol_type (]b4_join(
+  ]b4_parser_class_name[::token::token (]b4_join(
         [int tok],
         b4_symbol_if([$1], [has_type],
                      [b4_symbol([$1], [type]) v]),
@@ -459,7 +459,7 @@ m4_define([_b4_token_constructor_define],
   }
 #else
   inline
-  ]b4_parser_class_name[::symbol_type::symbol_type (]b4_join(
+  ]b4_parser_class_name[::token::token (]b4_join(
         [int tok],
         b4_symbol_if([$1], [has_type],
                      [const b4_symbol([$1], [type])& v]),
diff --git a/tests/c++.at b/tests/c++.at
index ee2e30cd..85c1fb6d 100644
--- a/tests/c++.at
+++ b/tests/c++.at
@@ -200,6 +200,7 @@ AT_PARSER_CHECK([./list], 0, [],
 AT_BISON_OPTION_POPDEFS
 AT_CLEANUP
 
+
 ## --------------------------------------------------- ##
 ## Multiple occurrences of $n and api.value.automove.  ##
 ## --------------------------------------------------- ##
@@ -1315,7 +1316,13 @@ int yylex (yy::parser::semantic_type *lvalp)
   // bug with a macro that erroneously expanded this identifier to
   // yystackp->yyval.
   YYUSE (lvalp);
-  return yy::parser::token::ZERO;
+
+  // Check that yy::parser::token::yytokentype works.  It was never documented,
+  // but it appears people have been depending on it, instead of using
+  // yy::parser::token_type.
+  yy::parser::token::yytokentype res = yy::parser::token::ZERO;
+
+  return res;
 }
 
 void yy::parser::error (std::string const&)

commit 3d0688ac6831984ecfa3d75eddb05d31a09d8c98
Author: Akim Demaille <address@hidden>
Date:   Sun Dec 23 10:11:47 2018 +0100

    WIP: tests: move to token instead of symbol_type.
    
    * #: .
    * #: .

diff --git a/tests/c++.at b/tests/c++.at
index 85c1fb6d..49e48693 100644
--- a/tests/c++.at
+++ b/tests/c++.at
@@ -145,10 +145,10 @@ exp: "int" { $$.push_back ($1); }
 int main()
 {
   using yy::parser;
-  // symbol_type: construction, accessor.
+  // token: construction, accessor.
   {
-    parser::symbol_type s = parser::make_INT(12);
-    std::cerr << s.value.as<int>() << '\n';
+    parser::token t = parser::make_INT(12);
+    std::cerr << t.value.as<int>() << '\n';
   }
 
   // stack_symbol_type: construction, accessor.
@@ -156,8 +156,8 @@ int main()
 #if defined __cplusplus && 201103L <= __cplusplus
     auto ss = parser::stack_symbol_type(1, parser::make_INT(123));
 #else
-    parser::symbol_type s = parser::make_INT(123);
-    parser::stack_symbol_type ss(1, s);
+    parser::token t = parser::make_INT(123);
+    parser::stack_symbol_type ss(1, t);
 #endif
     std::cerr << ss.value.as<int>() << '\n';
   }
@@ -175,8 +175,8 @@ int main()
         st.push(parser::stack_symbol_type{int_reduction_state,
                                           parser::make_INT (i)});
 #else
-        parser::symbol_type s = parser::make_INT (i);
-        parser::stack_symbol_type ss (int_reduction_state, s);
+        parser::token t = parser::make_INT (i);
+        parser::stack_symbol_type ss (int_reduction_state, t);
         st.push (ss);
 #endif
       }
@@ -568,7 +568,7 @@ AT_DATA_GRAMMAR([[input.y]],
   #include <iostream>
   namespace yy
   {
-    static yy::parser::symbol_type yylex();
+    static yy::parser::token yylex();
   }
 }
 
@@ -590,7 +590,7 @@ expr:
 %%
 namespace yy
 {
-  parser::symbol_type yylex()
+  parser::token yylex()
   {
     static int loc = 0;
     switch (loc++)
diff --git a/tests/local.at b/tests/local.at
index 146ed47b..67ad7755 100644
--- a/tests/local.at
+++ b/tests/local.at
@@ -252,7 +252,7 @@ AT_TOKEN_CTOR_IF(
 [m4_pushdef([AT_LOC], [[(]AT_NAME_PREFIX[lloc)]])
  m4_pushdef([AT_VAL], [[(]AT_NAME_PREFIX[lval)]])
  m4_pushdef([AT_YYLEX_FORMALS],     [])
- m4_pushdef([AT_YYLEX_RETURN],      [yy::parser::symbol_type])
+ m4_pushdef([AT_YYLEX_RETURN],      [yy::parser::token])
  m4_pushdef([AT_YYLEX_ARGS],        [])
  m4_pushdef([AT_USE_LEX_ARGS],      [])
  m4_pushdef([AT_YYLEX_PRE_FORMALS], [])
diff --git a/tests/types.at b/tests/types.at
index e41c21b1..0ac03843 100644
--- a/tests/types.at
+++ b/tests/types.at
@@ -299,11 +299,11 @@ m4_foreach([b4_skel], [[yacc.c], [glr.c], [lalr1.cc], [glr.cc]],
                           << $2.first << ':' << $2.second << '\n';
               }],
             ["12"],
-            [[typedef yy::parser::symbol_type symbol;
+            [[typedef yy::parser::token token;
              if (res)
-               return symbol (res, std::make_pair (res - '0', res - '0' + 1));
+               return token (res, std::make_pair (res - '0', res - '0' + 1));
              else
-               return symbol (res)]],
+               return token (res)]],
             [1:2, 2:3])
 
     # Move-only types, and variadic emplace.



