bison-patches

Re: Dynamic token kinds


From: Akim Demaille
Subject: Re: Dynamic token kinds
Date: Sat, 22 Dec 2018 14:59:29 +0100

Hi Frank!

> On 22 Dec 2018, at 01:14, Frank Heckenbach <address@hidden> wrote:
> 
> Akim Demaille wrote:
> 
>> I like this idea.  I have a draft for it in my repo, as "make-symbol".
>> Please, try it and report about it.
> 
> Again, sorry for the delay (still busy), but now I tried it
> (removing the "b4_parse_assert_if", see below).

Thanks for spending time on this.  I really need feedback  :)


> It seems to work for me. The only issue I had was due to sloppiness
> on my side. I'm only mentioning it in case others do the same.
> Basically, I had stored tokens of one specific semantic type in a
> look-up table together with tokens without semantic type (storing a
> dummy value in the table for the latter), and constructed the tokens
> for both in the same branch, exploiting the only case where a
> mismatch is inconsequential, i.e. setting a value and not using it.
> This worked before, but the stricter checks now (correctly) caught
> it.

Great news!   It works!


> To actually allow this, you could have the typed constructors all
> accept the typeless tokens as well, but I don't consider that really
> necessary. Unless you want to support that for backward (bugward?)
> compatibility, I'll just change my code to make two separate
> make_symbol calls.

Yes, I prefer it this way.  The whole point of my work on C++'s
symbols so far is really to be type safe(r).
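
For readers who hit the same look-up table issue, here is a minimal
sketch of the "two separate calls" approach.  Everything in it (the
token numbers, the symbol and entry structs, symbol_from_table) is
hypothetical and hand-written for illustration; it is not what Bison
generates.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical token numbers; a real parser would get them from Bison.
enum { TOK_IF = 258, TOK_OP = 259 };

// Hypothetical stand-in for a symbol type with checked constructors.
struct symbol
{
  int kind;
  bool has_value;
  std::string value;

  // Typeless tokens.
  explicit symbol (int tok)
    : kind (tok), has_value (false), value ()
  {
    assert (tok == TOK_IF);
  }

  // Tokens whose semantic value is a std::string.
  symbol (int tok, const std::string& v)
    : kind (tok), has_value (true), value (v)
  {
    assert (tok == TOK_OP);
  }
};

// One look-up table entry: the token number, and whether it carries a value.
struct entry
{
  int kind;
  bool typed;
};

// Build the symbol for WORD, with one branch per value type instead of
// passing a dummy value for typeless tokens.
inline symbol
symbol_from_table (const std::map<std::string, entry>& table,
                   const std::string& word)
{
  const entry& e = table.at (word);
  if (e.typed)
    return symbol (e.kind, word);
  return symbol (e.kind);
}
```

With a single branch that always passed a value, the typeless entries
would trip the assertion in the typed constructor.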


>> There are a few issues:
>> - make_symbol will collide if the user has a token named symbol
>>  Any idea of a better name?  
> 
> To avoid such collisions, I think we have to avoid the "make_"
> prefix entirely. Maybe "build_symbol"?
> 
>> Or simply make them actual constructors for `symbol_type`.
> 
> Yes, if they are (documented as) public. I think I'd prefer this as
> I wouldn't have to change my code from 3.2.2.

See below, I have a working draft that completely replaces
make_symbol by "merging" the assert-based type checking into
the symbol_type constructors.  Since that makes the ctors safe,
I'm fine with exposing them.
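
To make the idea concrete, here is a rough hand-written sketch of
what assert-checked constructors amount to for a grammar with ':',
ID (std::string) and INT (int).  The names toy_parser, text and
number, and the token numbers, are assumptions made up for the
sketch; the generated code stores values in a variant and is laid
out differently.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical, hand-written stand-in for a generated parser class.
struct toy_parser
{
  // Made-up token numbers.
  enum token_type { END = 0, ID = 258, INT = 259 };

  struct symbol_type
  {
    int kind;
    std::string text;   // Used only by ID.
    int number;         // Used only by INT.

    // Valueless tokens: end of input and the character token ':'.
    explicit symbol_type (int tok)
      : kind (tok), text (), number (0)
    {
      assert (tok == END || tok == ':');
    }

    // Tokens carrying a std::string.
    symbol_type (int tok, std::string v)
      : kind (tok), text (std::move (v)), number (0)
    {
      assert (tok == ID);
    }

    // Tokens carrying an int.
    symbol_type (int tok, int v)
      : kind (tok), text (), number (v)
    {
      assert (tok == INT);
    }
  };
};
```

Overload resolution picks the constructor from the value's type, and
the assertion then checks that the token number belongs to that
type.  A mismatch such as symbol_type (':', "foo") still compiles,
but aborts at run time unless NDEBUG is defined.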

I wish it required fewer changes.  In particular, it tears
symbol_type and stack_symbol_type even further apart.  My CRTP
might no longer fully make sense; maybe I'll get rid of it at
some point.


>> - In the signature of make_symbol, I've had to use int for the
>>  token type, instead of the enum token_type, and then
>>  to convert the int into token_type.  I don't like that, but I
>>  probably don't have much of a choice.  (A template would be
>>  overkill IMHO).
> 
> Why, is it because of char tokens (like '+' in your example)?

Yes, exactly.  I don't like that we accept ASCII, but we have
to.
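
A small sketch of why the parameter has to be an int.  The enum and
the two helper functions are hypothetical, written only to show the
conversion rules involved.

```cpp
#include <cassert>

// Hypothetical token enum, in the style of a generated parser.
enum token_type { END = 0, ID = 258 };

// Taking the enum directly rejects character tokens at compile time:
//   kind_from_enum ('+');   // error: char does not convert to token_type
inline int
kind_from_enum (token_type t)
{
  return t;
}

// Taking an int accepts both named tokens and character literals; the
// conversion to the enum is then done explicitly inside.
inline token_type
kind_from_int (int tok)
{
  return static_cast<token_type> (tok);
}
```

Named tokens convert implicitly from the enum to int, and character
literals promote to int, so one signature serves both.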


Here is the patch, twice.  I want to keep the previous one
(with make_symbol) in the git history, so the second patch below
shows the actual commit, relative to make_symbol.  But I think
the patch relative to the state before make_symbol reads better
(it's the first one below).


Now that I have done this, I think I can merge two different
types that currently exist in yy::parser: token and symbol_type.
The first one is used _only_ to define the enum of the various
token types, and the second one, well, implements the tokens.  And
'token' is actually a better name than 'symbol_type'.
Of course, I will leave a type alias for symbol_type.
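
Roughly, the merge could look like the following.  This is only a
sketch of the idea: the member names kind_type and kind, and the
token numbers, are assumptions, not an actual design.

```cpp
#include <cassert>

// Hypothetical sketch, not Bison's generated code.
struct parser
{
  // After the merge: one class holds both the enum of token kinds
  // and the token itself.
  struct token
  {
    enum kind_type { END = 0, ID = 258, INT = 259 };

    int kind;

    explicit token (int k)
      : kind (k)
    {}
  };

  // Backward compatibility: existing scanners keep using symbol_type.
  typedef token symbol_type;
};
```

Code that names parser::token::ID or parser::symbol_type keeps
compiling, since both still resolve.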

WDYT?

Currently in https://github.com/akimd/bison/tree/make-symbol.

commit 5e8571708e34b3e69a8182a88057199e7bf63568
Author: Akim Demaille <address@hidden>
Date:   Wed Dec 19 17:51:10 2018 +0100

    c++: exhibit a safe symbol_type
    
    Instead of introducing make_symbol (whose name, btw, somewhat
    infringes on the user's "name space", if she defines a token named
    "symbol"), let's make the construction of symbol_type safer, using
    assertions.
    
    For instance with:
    
      %token ':' <std::string> ID <int> INT;
    
    generate:
    
        symbol_type (int token, const std::string&);
        symbol_type (int token, const int&);
        symbol_type (int token);
    
    It does mean that now named token constructors (make_ID, make_INT,
    etc.) go through a useless assert, but I think we can ignore this: I
    assume any decent compiler will inline the symbol_type ctor inside the
    make_TOKEN functions, which will show that the assert is trivially
    verified, hence I expect no code will be emitted for it.  And anyway,
    that's an assert, NDEBUG controls it.
    
    * data/c++.m4 (symbol_type): Turn into a subclass of
    basic_symbol<by_type>.
    Declare symbol constructors when variants are enabled.
    * data/variant.hh (_b4_type_constructor_declare)
    (_b4_type_constructor_define): Replace with...
    (_b4_symbol_constructor_declare, _b4_symbol_constructor_define): these.
    Generate symbol_type constructors.
    * doc/bison.texi (Complete Symbols): Document.
    * tests/types.at: Check.

diff --git a/NEWS b/NEWS
index 08d99f19..c67fb142 100644
--- a/NEWS
+++ b/NEWS
@@ -96,10 +96,36 @@ GNU Bison NEWS
   until it sees the '='.  So we notate the two possible reductions to
   indicate that each conflicts in one rule.
 
+*** C++: Actual token constructors
+
+  When variants and token constructors are enabled, in addition to the
+  type-safe named token constructors (make_ID, make_INT, etc.), we now
+  generate genuine constructors for symbol_type.
+
+  For instance with these declarations
+
+    %token           ':'
+       <std::string> ID
+       <int>         INT;
+
+  you may use these constructors:
+
+    symbol_type (int token, const std::string&);
+    symbol_type (int token, const int&);
+    symbol_type (int token);
+
+  which should be used in a Flex-scanner as follows.
+
+    %%
+    [a-z]+   return yy::parser::symbol_type (ID, yytext);
+    [0-9]+   return yy::parser::symbol_type (INT, text_to_int (yytext));
+    ":"      return yy::parser::symbol_type (':');
+    <<EOF>>  return yy::parser::symbol_type (0);
+
 *** C++: Variadic emplace
 
-  If your application requires C++11, you may now use a variadic emplace for
-  semantic values:
+  If your application requires C++11 and you don't use symbol constructors,
+  you may now use a variadic emplace for semantic values:
 
     %define api.value.type variant
     %token <std::pair<int, int>> PAIR
diff --git a/data/c++.m4 b/data/c++.m4
index b4f56add..eb5c47f0 100644
--- a/data/c++.m4
+++ b/data/c++.m4
@@ -332,7 +332,17 @@ m4_define([b4_symbol_type_declare],
     };
 
     /// "External" symbols: returned by the scanner.
-    typedef basic_symbol<by_type> symbol_type;
+    struct symbol_type : basic_symbol<by_type>
+    {]b4_variant_if([[
+      /// Superclass.
+      typedef basic_symbol<by_type> super_type;
+
+      /// Empty symbol.
+      symbol_type () {};
+
+      /// Constructor for valueless symbols, and symbols from each type.
+]b4_type_foreach([_b4_symbol_constructor_declare])[
+    ]])[};
 ]])
 
 
diff --git a/data/variant.hh b/data/variant.hh
index 836616a6..4e036d1e 100644
--- a/data/variant.hh
+++ b/data/variant.hh
@@ -335,6 +335,16 @@ m4_define([b4_symbol_value_template],
 ## ------------- ##
 
 
+# _b4_includes_tokens(SYMBOL-NUM...)
+# ----------------------------------
+# Expands to non-empty iff one of the SYMBOL-NUM denotes
+# a token.
+m4_define([_b4_is_token],
+          [b4_symbol_if([$1], [is_token], [1])])
+m4_define([_b4_includes_tokens],
+          [m4_map([_b4_is_token], [$@])])
+
+
 # _b4_token_maker_declare(SYMBOL-NUM)
 # -----------------------------------
 # Declare make_SYMBOL for SYMBOL-NUM.  Use at class-level.
@@ -358,10 +368,31 @@ m4_define([_b4_token_maker_declare],
 ])])
 
 
+# _b4_symbol_constructor_declare(SYMBOL-NUM...)
+# ---------------------------------------------
+# Declare a unique make_symbol for all the SYMBOL-NUM (they
+# have the same type).  Use at class-level.
+m4_define([_b4_symbol_constructor_declare],
+[m4_ifval(_b4_includes_tokens($@),
+[#if 201103L <= YY_CPLUSPLUS
+    symbol_type (b4_join(
+        [int tok],
+        b4_symbol_if([$1], [has_type],
+                     [b4_symbol([$1], [type]) v]),
+        b4_locations_if([location_type l])));
+#else
+    symbol_type (b4_join(
+        [int tok],
+        b4_symbol_if([$1], [has_type],
+                     [const b4_symbol([$1], [type])& v]),
+        b4_locations_if([const location_type& l])));
+#endif
+])])
+
+
 # b4_symbol_constructor_declare
 # -----------------------------
-# Declare symbol constructors for all the value types.
-# Use at class-level.
+# Declare symbol constructors.  Use at class-level.
 m4_define([b4_symbol_constructor_declare],
 [    // Symbol constructors declarations.
 b4_symbol_foreach([_b4_token_maker_declare])])
@@ -401,6 +432,48 @@ m4_define([_b4_token_maker_define],
 ])])
 
 
+# _b4_symbol_constructor_define(SYMBOL-NUM...)
+# --------------------------------------------
+# Declare a unique make_symbol for all the SYMBOL-NUM (they
+# have the same type).  Use at class-level.
+m4_define([_b4_type_clause],
+[b4_symbol_if([$1], [is_token],
+              [b4_symbol_if([$1], [has_id],
+                            [tok == token::b4_symbol([$1], [id])],
+                            [tok == b4_symbol([$1], [user_number])])])])
+
+m4_define([_b4_symbol_constructor_define],
+[m4_ifval(_b4_includes_tokens($@),
+[[#if 201103L <= YY_CPLUSPLUS
+  inline
+  ]b4_parser_class_name[::symbol_type::symbol_type (]b4_join(
+        [int tok],
+        b4_symbol_if([$1], [has_type],
+                     [b4_symbol([$1], [type]) v]),
+        b4_locations_if([location_type l]))[)
+    : super_type(]b4_join([token_type (tok)],
+                          b4_symbol_if([$1], [has_type], [std::move (v)]),
+                          b4_locations_if([std::move (l)]))[)
+  {
+    YYASSERT (]m4_join([ || ], m4_map_sep([_b4_type_clause], [, ], [$@]))[);
+  }
+#else
+  inline
+  ]b4_parser_class_name[::symbol_type::symbol_type (]b4_join(
+        [int tok],
+        b4_symbol_if([$1], [has_type],
+                     [const b4_symbol([$1], [type])& v]),
+        b4_locations_if([const location_type& l]))[)
+    : super_type(]b4_join([token_type (tok)],
+                          b4_symbol_if([$1], [has_type], [v]),
+                          b4_locations_if([l]))[)
+  {
+    YYASSERT (]m4_join([ || ], m4_map_sep([_b4_type_clause], [, ], [$@]))[);
+  }
+#endif
+]])])
+
+
 # b4_basic_symbol_constructor_declare(SYMBOL-NUM)
 # -----------------------------------------------
 # Generate a constructor declaration for basic_symbol from given type.
@@ -452,4 +525,5 @@ m4_define([b4_basic_symbol_constructor_define],
 # Define the overloaded versions of make_symbol for all the value types.
 m4_define([b4_symbol_constructor_define],
 [  // Implementation of make_symbol for each symbol type.
+b4_type_foreach([_b4_symbol_constructor_define])
 b4_symbol_foreach([_b4_token_maker_define])])
diff --git a/doc/bison.texi b/doc/bison.texi
index 89283be7..e1a5aaba 100644
--- a/doc/bison.texi
+++ b/doc/bison.texi
@@ -11500,6 +11500,57 @@ additional arguments.
 
 For each token type, Bison generates named constructors as follows.
 
+@deftypeop  {Constructor} {parser::symbol_type} {} {symbol_type} (int @var{token}, const @var{value_type}& @var{value}, const location_type& @var{location})
+@deftypeopx {Constructor} {parser::symbol_type} {} {symbol_type} (int @var{token}, const location_type& @var{location})
+@deftypeopx {Constructor} {parser::symbol_type} {} {symbol_type} (int @var{token}, const @var{value_type}& @var{value})
+@deftypeopx {Constructor} {parser::symbol_type} {} {symbol_type} (int @var{token})
+Build a complete terminal symbol for the token type @var{token} (including
+the @code{api.token.prefix}), whose semantic value, if it has one, is
+@var{value} of adequate @var{value_type}.  Pass the @var{location} iff
+location tracking is enabled.
+
+Consistency between @var{token} and @var{value_type} is checked via an
+@code{assert}.
+@end deftypeop
+
+For instance, given the following declarations:
+
+@example
+%define api.token.prefix @{TOK_@}
+%token <std::string> IDENTIFIER;
+%token <int> INTEGER;
+%token ':';
+@end example
+
+@noindent
+you may use these constructors:
+
+@example
+symbol_type (int token, const std::string&, const location_type&);
+symbol_type (int token, const int&, const location_type&);
+symbol_type (int token, const location_type&);
+@end example
+
+@noindent
+which should be used in a Flex-scanner as follows.
+
+@example
+%%
+[a-z]+   return yy::parser::symbol_type (TOK_IDENTIFIER, yytext, loc);
+[0-9]+   return yy::parser::symbol_type (TOK_INTEGER, text_to_int (yytext), loc);
+":"      return yy::parser::symbol_type (':', loc);
+<<EOF>>  return yy::parser::symbol_type (0, loc);
+@end example
+
+@sp 1
+
+Note that it is possible to generate and compile type incorrect code
+(e.g. @samp{symbol_type (':', yytext, loc)}).  It will fail at run time,
+provided the assertions are enabled (i.e., @option{-DNDEBUG} was not passed
+to the compiler).  Bison supports an alternative that guarantees that type
+incorrect code will not even compile.  Indeed, it generates @emph{named
+constructors} as follows.
+
 @deftypemethod {parser} {symbol_type} make_@var{token} (const @var{value_type}& @var{value}, const location_type& @var{location})
 @deftypemethodx {parser} {symbol_type} make_@var{token} (const location_type& @var{location})
 @deftypemethodx {parser} {symbol_type} make_@var{token} (const @var{value_type}& @var{value})
@@ -11531,7 +11582,7 @@ symbol_type make_EOF (const location_type&);
 @end example
 
 @noindent
-which should be used in a Flex-scanner as follows.
+which should be used in a scanner as follows.
 
 @example
 [a-z]+   return yy::parser::make_IDENTIFIER (yytext, loc);
@@ -11544,6 +11595,7 @@ Tokens that do not have an identifier are not accessible: you cannot simply
 use characters such as @code{':'}, they must be declared with @code{%token},
 including the end-of-file token.
 
+
 @node A Complete C++ Example
 @subsection A Complete C++ Example
 
diff --git a/tests/types.at b/tests/types.at
index 2924ec18..e41c21b1 100644
--- a/tests/types.at
+++ b/tests/types.at
@@ -288,6 +288,24 @@ m4_foreach([b4_skel], [[yacc.c], [glr.c], [lalr1.cc], [glr.cc]],
                AT_VAL.build (std::pair<std::string, std::string> ("two", "deux"));],
             [10:11, two:deux])
 
+    # Type-based token constructors on move-only types, and types with commas.
+    AT_TEST([%skeleton "]b4_skel["
+             %define api.value.type variant
+             %define api.token.constructor],
+            [[%token <std::pair<int, int>> '1' '2';]],
+            ['1' '2'
+              {
+                std::cout << $1.first << ':' << $1.second << ", "
+                          << $2.first << ':' << $2.second << '\n';
+              }],
+            ["12"],
+            [[typedef yy::parser::symbol_type symbol;
+             if (res)
+               return symbol (res, std::make_pair (res - '0', res - '0' + 1));
+             else
+               return symbol (res)]],
+            [1:2, 2:3])
+
     # Move-only types, and variadic emplace.
     AT_TEST([%skeleton "]b4_skel["
              %code requires { #include <memory> }
@@ -325,6 +343,25 @@ m4_foreach([b4_skel], [[yacc.c], [glr.c], [lalr1.cc], [glr.cc]],
             [10, 21:22],
             [AT_REQUIRE_CXX_STD(14, [echo "$at_std not supported"; continue])])
 
+    # Type-based token constructors on move-only types, and types with commas.
+    AT_TEST([%skeleton "]b4_skel["
+             %code requires { #include <memory> }
+             %define api.value.type variant
+             %define api.token.constructor],
+            [[%token <std::unique_ptr<int>> '1';
+             %token <std::pair<int, int>> '2';]],
+            ['1' '2' { std::cout << *$1 << ", "
+                                 << $2.first << ':' << $2.second << '\n'; }],
+            ["12"],
+            [[if (res == '1')
+               return {res, std::make_unique<int> (10)};
+             else if (res == '2')
+               return {res, std::make_pair (21, 22)};
+             else
+               return res]],
+            [10, 21:22],
+            [AT_REQUIRE_CXX_STD(14, [echo "$at_std not supported"; continue])])
+
   ])
 ])
 










commit 5e8571708e34b3e69a8182a88057199e7bf63568
Author: Akim Demaille <address@hidden>
Date:   Wed Dec 19 17:51:10 2018 +0100

    c++: exhibit a safe symbol_type
    
    Instead of introducing make_symbol (whose name, btw, somewhat
    infringes on the user's "name space", if she defines a token named
    "symbol"), let's make the construction of symbol_type safer, using
    assertions.
    
    For instance with:
    
      %token ':' <std::string> ID <int> INT;
    
    generate:
    
        symbol_type (int token, const std::string&);
        symbol_type (int token, const int&);
        symbol_type (int token);
    
    It does mean that now named token constructors (make_ID, make_INT,
    etc.) go through a useless assert, but I think we can ignore this: I
    assume any decent compiler will inline the symbol_type ctor inside the
    make_TOKEN functions, which will show that the assert is trivially
    verified, hence I expect no code will be emitted for it.  And anyway,
    that's an assert, NDEBUG controls it.
    
    * data/c++.m4 (symbol_type): Turn into a subclass of
    basic_symbol<by_type>.
    Declare symbol constructors when variants are enabled.
    * data/variant.hh (_b4_type_constructor_declare)
    (_b4_type_constructor_define): Replace with...
    (_b4_symbol_constructor_declare, _b4_symbol_constructor_define): these.
    Generate symbol_type constructors.
    * doc/bison.texi (Complete Symbols): Document.
    * tests/types.at: Check.

diff --git a/NEWS b/NEWS
index 08d99f19..c67fb142 100644
--- a/NEWS
+++ b/NEWS
@@ -96,10 +96,36 @@ GNU Bison NEWS
   until it sees the '='.  So we notate the two possible reductions to
   indicate that each conflicts in one rule.
 
+*** C++: Actual token constructors
+
+  When variants and token constructors are enabled, in addition to the
+  type-safe named token constructors (make_ID, make_INT, etc.), we now
+  generate genuine constructors for symbol_type.
+
+  For instance with these declarations
+
+    %token           ':'
+       <std::string> ID
+       <int>         INT;
+
+  you may use these constructors:
+
+    symbol_type (int token, const std::string&);
+    symbol_type (int token, const int&);
+    symbol_type (int token);
+
+  which should be used in a Flex-scanner as follows.
+
+    %%
+    [a-z]+   return yy::parser::symbol_type (ID, yytext);
+    [0-9]+   return yy::parser::symbol_type (INT, text_to_int (yytext));
+    ":"      return yy::parser::symbol_type (':');
+    <<EOF>>  return yy::parser::symbol_type (0);
+
 *** C++: Variadic emplace
 
-  If your application requires C++11, you may now use a variadic emplace for
-  semantic values:
+  If your application requires C++11 and you don't use symbol constructors,
+  you may now use a variadic emplace for semantic values:
 
     %define api.value.type variant
     %token <std::pair<int, int>> PAIR
diff --git a/data/c++.m4 b/data/c++.m4
index b4f56add..eb5c47f0 100644
--- a/data/c++.m4
+++ b/data/c++.m4
@@ -332,7 +332,17 @@ m4_define([b4_symbol_type_declare],
     };
 
     /// "External" symbols: returned by the scanner.
-    typedef basic_symbol<by_type> symbol_type;
+    struct symbol_type : basic_symbol<by_type>
+    {]b4_variant_if([[
+      /// Superclass.
+      typedef basic_symbol<by_type> super_type;
+
+      /// Empty symbol.
+      symbol_type () {};
+
+      /// Constructor for valueless symbols, and symbols from each type.
+]b4_type_foreach([_b4_symbol_constructor_declare])[
+    ]])[};
 ]])
 
 
diff --git a/data/variant.hh b/data/variant.hh
index 22832248..4e036d1e 100644
--- a/data/variant.hh
+++ b/data/variant.hh
@@ -368,25 +368,21 @@ m4_define([_b4_token_maker_declare],
 ])])
 
 
-# _b4_type_constructor_declare(SYMBOL-NUM...)
-# -------------------------------------------
+# _b4_symbol_constructor_declare(SYMBOL-NUM...)
+# ---------------------------------------------
 # Declare a unique make_symbol for all the SYMBOL-NUM (they
 # have the same type).  Use at class-level.
-m4_define([_b4_type_constructor_declare],
+m4_define([_b4_symbol_constructor_declare],
 [m4_ifval(_b4_includes_tokens($@),
 [#if 201103L <= YY_CPLUSPLUS
-    static
-    symbol_type
-    make_symbol (dnl
-b4_join([int tok],
+    symbol_type (b4_join(
+        [int tok],
         b4_symbol_if([$1], [has_type],
                      [b4_symbol([$1], [type]) v]),
         b4_locations_if([location_type l])));
 #else
-    static
-    symbol_type
-    make_symbol (dnl
-b4_join([int tok],
+    symbol_type (b4_join(
+        [int tok],
         b4_symbol_if([$1], [has_type],
                      [const b4_symbol([$1], [type])& v]),
         b4_locations_if([const location_type& l])));
@@ -399,7 +395,6 @@ b4_join([int tok],
 # Declare symbol constructors.  Use at class-level.
 m4_define([b4_symbol_constructor_declare],
 [    // Symbol constructors declarations.
-b4_type_foreach([_b4_type_constructor_declare])
 b4_symbol_foreach([_b4_token_maker_declare])])
 
 
@@ -437,8 +432,8 @@ m4_define([_b4_token_maker_define],
 ])])
 
 
-# _b4_type_constructor_define(SYMBOL-NUM...)
-# ------------------------------------------
+# _b4_symbol_constructor_define(SYMBOL-NUM...)
+# --------------------------------------------
 # Declare a unique make_symbol for all the SYMBOL-NUM (they
 # have the same type).  Use at class-level.
 m4_define([_b4_type_clause],
@@ -447,38 +442,36 @@ m4_define([_b4_type_clause],
                             [tok == token::b4_symbol([$1], [id])],
                             [tok == b4_symbol([$1], [user_number])])])])
 
-m4_define([_b4_type_constructor_define],
+m4_define([_b4_symbol_constructor_define],
 [m4_ifval(_b4_includes_tokens($@),
-[#if 201103L <= YY_CPLUSPLUS
+[[#if 201103L <= YY_CPLUSPLUS
   inline
-  b4_parser_class_name::symbol_type
-  b4_parser_class_name::make_symbol (dnl
-b4_join([int tok],
+  ]b4_parser_class_name[::symbol_type::symbol_type (]b4_join(
+        [int tok],
         b4_symbol_if([$1], [has_type],
                      [b4_symbol([$1], [type]) v]),
-        b4_locations_if([location_type l])))
-  {b4_parse_assert_if([
-    assert (m4_join([ || ], m4_map_sep([_b4_type_clause], [, ], [$@])));])[
-    return symbol_type (]b4_join([token_type (tok)],
-                                b4_symbol_if([$1], [has_type], [std::move (v)]),
-                                b4_locations_if([std::move (l)])));
+        b4_locations_if([location_type l]))[)
+    : super_type(]b4_join([token_type (tok)],
+                          b4_symbol_if([$1], [has_type], [std::move (v)]),
+                          b4_locations_if([std::move (l)]))[)
+  {
+    YYASSERT (]m4_join([ || ], m4_map_sep([_b4_type_clause], [, ], [$@]))[);
   }
 #else
   inline
-  b4_parser_class_name::symbol_type
-  b4_parser_class_name::make_symbol (dnl
-b4_join([int tok],
+  ]b4_parser_class_name[::symbol_type::symbol_type (]b4_join(
+        [int tok],
         b4_symbol_if([$1], [has_type],
                      [const b4_symbol([$1], [type])& v]),
-        b4_locations_if([const location_type& l])))
-  {b4_parse_assert_if([
-    assert (m4_join([ || ], m4_map_sep([_b4_type_clause], [, ], [$@])));])[
-    return symbol_type (]b4_join([token_type (tok)],
-                                b4_symbol_if([$1], [has_type], [v]),
-                                b4_locations_if([l])));
+        b4_locations_if([const location_type& l]))[)
+    : super_type(]b4_join([token_type (tok)],
+                          b4_symbol_if([$1], [has_type], [v]),
+                          b4_locations_if([l]))[)
+  {
+    YYASSERT (]m4_join([ || ], m4_map_sep([_b4_type_clause], [, ], [$@]))[);
   }
 #endif
-])])
+]])])
 
 
 # b4_basic_symbol_constructor_declare(SYMBOL-NUM)
@@ -532,5 +525,5 @@ m4_define([b4_basic_symbol_constructor_define],
 # Define the overloaded versions of make_symbol for all the value types.
 m4_define([b4_symbol_constructor_define],
 [  // Implementation of make_symbol for each symbol type.
-b4_type_foreach([_b4_type_constructor_define])
+b4_type_foreach([_b4_symbol_constructor_define])
 b4_symbol_foreach([_b4_token_maker_define])])
diff --git a/doc/bison.texi b/doc/bison.texi
index 89283be7..e1a5aaba 100644
--- a/doc/bison.texi
+++ b/doc/bison.texi
@@ -11500,6 +11500,57 @@ additional arguments.
 
 For each token type, Bison generates named constructors as follows.
 
+@deftypeop  {Constructor} {parser::symbol_type} {} {symbol_type} (int @var{token}, const @var{value_type}& @var{value}, const location_type& @var{location})
+@deftypeopx {Constructor} {parser::symbol_type} {} {symbol_type} (int @var{token}, const location_type& @var{location})
+@deftypeopx {Constructor} {parser::symbol_type} {} {symbol_type} (int @var{token}, const @var{value_type}& @var{value})
+@deftypeopx {Constructor} {parser::symbol_type} {} {symbol_type} (int @var{token})
+Build a complete terminal symbol for the token type @var{token} (including
+the @code{api.token.prefix}), whose semantic value, if it has one, is
+@var{value} of adequate @var{value_type}.  Pass the @var{location} iff
+location tracking is enabled.
+
+Consistency between @var{token} and @var{value_type} is checked via an
+@code{assert}.
+@end deftypeop
+
+For instance, given the following declarations:
+
+@example
+%define api.token.prefix @{TOK_@}
+%token <std::string> IDENTIFIER;
+%token <int> INTEGER;
+%token ':';
+@end example
+
+@noindent
+you may use these constructors:
+
+@example
+symbol_type (int token, const std::string&, const location_type&);
+symbol_type (int token, const int&, const location_type&);
+symbol_type (int token, const location_type&);
+@end example
+
+@noindent
+which should be used in a Flex-scanner as follows.
+
+@example
+%%
+[a-z]+   return yy::parser::symbol_type (TOK_IDENTIFIER, yytext, loc);
+[0-9]+   return yy::parser::symbol_type (TOK_INTEGER, text_to_int (yytext), loc);
+":"      return yy::parser::symbol_type (':', loc);
+<<EOF>>  return yy::parser::symbol_type (0, loc);
+@end example
+
+@sp 1
+
+Note that it is possible to generate and compile type incorrect code
+(e.g. @samp{symbol_type (':', yytext, loc)}).  It will fail at run time,
+provided the assertions are enabled (i.e., @option{-DNDEBUG} was not passed
+to the compiler).  Bison supports an alternative that guarantees that type
+incorrect code will not even compile.  Indeed, it generates @emph{named
+constructors} as follows.
+
 @deftypemethod {parser} {symbol_type} make_@var{token} (const @var{value_type}& @var{value}, const location_type& @var{location})
 @deftypemethodx {parser} {symbol_type} make_@var{token} (const location_type& @var{location})
 @deftypemethodx {parser} {symbol_type} make_@var{token} (const @var{value_type}& @var{value})
@@ -11531,7 +11582,7 @@ symbol_type make_EOF (const location_type&);
 @end example
 
 @noindent
-which should be used in a Flex-scanner as follows.
+which should be used in a scanner as follows.
 
 @example
 [a-z]+   return yy::parser::make_IDENTIFIER (yytext, loc);
@@ -11544,6 +11595,7 @@ Tokens that do not have an identifier are not accessible: you cannot simply
 use characters such as @code{':'}, they must be declared with @code{%token},
 including the end-of-file token.
 
+
 @node A Complete C++ Example
 @subsection A Complete C++ Example
 
diff --git a/tests/types.at b/tests/types.at
index bead23d0..e41c21b1 100644
--- a/tests/types.at
+++ b/tests/types.at
@@ -288,6 +288,24 @@ m4_foreach([b4_skel], [[yacc.c], [glr.c], [lalr1.cc], [glr.cc]],
                AT_VAL.build (std::pair<std::string, std::string> ("two", "deux"));],
             [10:11, two:deux])
 
+    # Type-based token constructors on move-only types, and types with commas.
+    AT_TEST([%skeleton "]b4_skel["
+             %define api.value.type variant
+             %define api.token.constructor],
+            [[%token <std::pair<int, int>> '1' '2';]],
+            ['1' '2'
+              {
+                std::cout << $1.first << ':' << $1.second << ", "
+                          << $2.first << ':' << $2.second << '\n';
+              }],
+            ["12"],
+            [[typedef yy::parser::symbol_type symbol;
+             if (res)
+               return symbol (res, std::make_pair (res - '0', res - '0' + 1));
+             else
+               return symbol (res)]],
+            [1:2, 2:3])
+
     # Move-only types, and variadic emplace.
     AT_TEST([%skeleton "]b4_skel["
              %code requires { #include <memory> }
@@ -336,11 +354,11 @@ m4_foreach([b4_skel], [[yacc.c], [glr.c], [lalr1.cc], [glr.cc]],
                                  << $2.first << ':' << $2.second << '\n'; }],
             ["12"],
             [[if (res == '1')
-               return yy::parser::make_symbol ('1', std::make_unique<int> (10));
+               return {res, std::make_unique<int> (10)};
              else if (res == '2')
-               return yy::parser::make_symbol ('2', std::make_pair (21, 22));
+               return {res, std::make_pair (21, 22)};
              else
-               return yy::parser::make_symbol (0)]],
+               return res]],
             [10, 21:22],
             [AT_REQUIRE_CXX_STD(14, [echo "$at_std not supported"; continue])])
 



