Re: symbol_type::token () removed?

From: Akim Demaille
Subject: Re: symbol_type::token () removed?
Date: Thu, 2 Jan 2020 08:27:42 +0100

Hi Wolfgang,

Finally a bug report!  3.5 was released three weeks ago, and usually
bug reports arrive very soon after the release.  I was getting worried :)

> On 1 Jan 2020, at 23:51, Wolfgang Thaller <address@hidden> wrote:
> Commit 8c87a623 says:
>> c++: get rid of symbol_type::token ()
>> It is not used.  And its implementation was wrong when api.token.raw
>> was defined, as it was still mapping to the external token numbers,
>> instead of the internal ones.  Besides it was provided only when
>> api.token.constructor is defined, yet always declared.
>> * data/skeletons/c++.m4 (by_type::token): Remove, useless.
> Well, I disagree. It's not useless, I'm using it,


About a year ago, I wrote to you

>> The thing is that I don't know how people are using these types: they do 
>> things that are not documented, and as such, are not considered part of the 
>> contract.

and that's another instance of that pattern.

> and I think my use case is legitimate.
> [...]
> So I have things like:
>       Lexer lex("12345");
>       symbol_type t = lex.nextToken();
>       assert(t.token() == token::INTLIT);
>       assert(t.value.as<int>() == 12345);

which gives me the impression that the real use case would be something like

assert(t == token::make_INTLIT(12345))

WDYT?  Hum.  That would require operator==, which would generate a lot
of code.  And it would require the value types to support ==, which so
far we do not require.

> ... which suddenly stopped working.
> Fortunately, I can instead do...
>       assert(t.type_get() == by_type(token::INTLIT).type_get());
> ... but that's a bit ugly.

That's because yytranslate_ is private; otherwise it could be

        assert(t.type_get() == yytranslate_(token::INTLIT));

> In general, I think symbol_type should always provide enough public API
> in order to allow inspecting it. In my case, I only need it for
> implementing tests, but it general it might be useful when writing the
> lexer, as well.

The reason I was very happy to get rid of it is that it's kind of big:
it includes a table whose size is the number of tokens.  I try to keep
our parsers thin.  Commit 8c87a623 is:

-  ]b4_inline([$1])b4_parser_class[::token_type
-  ]b4_parser_class[::by_type::token () const YY_NOEXCEPT
-  {
-    // YYTOKNUM[NUM] -- (External) token number corresponding to the
-    // (internal) symbol number NUM (which must be that of a token).  */
-    static
-    const ]b4_int_type_for([b4_toknum])[
-    yytoken_number_[] =
-    {
-  ]b4_toknum[
-    };
-    return token_type (yytoken_number_[type]);
-  }

Have you tried api.token.raw?  In that case symbol_type::token()
would be really thin: it would just return 'type'.

> P.S.: Keep up the good work, bison is my favorite way of parsing things
> using C++....

Thanks a lot for saying :)
