Re: Dynamic token kinds

From: Frank Heckenbach
Subject: Re: Dynamic token kinds
Date: Mon, 17 Dec 2018 11:17:59 +0100

Hans Åberg wrote:

> > On 17 Dec 2018, at 10:48, Frank Heckenbach <address@hidden> wrote:
> > 
> > I think we agree here, and that was actually my concern when I
> > started this thread. I don't want to have to write a separate case
> > for each token kind in my lexer. Of course, we need a separate case
> > for each semantic type because that involves a different type in the
> > constructor/builder call already, but these are relatively few,
> > compared to token kinds, in my lexers.
> Might Bison generate a function with a switch statement that produces the
> right return value for the lexer to use?

Different semantic types need separate functions since C++ is
strongly typed. Perhaps an example makes it clearer:

Say we have tokens V_FOO and V_BAR with no semantic type, I_BAZ and
I_QUX with semantic type int and S_BLA with type string. (BTW, I'm
no fan of Hungarian notation, I just use it here for the sake of the
example.) So far Bison generates (roughly speaking):

  symbol_type make_V_FOO ();
  symbol_type make_V_BAR ();
  symbol_type make_I_BAZ (int &&);
  symbol_type make_I_QUX (int &&);
  symbol_type make_S_BLA (string &&);

What I suggest adding (without changing the above) is:

  symbol_type make_symbol (token_type type);
  // checks at runtime that type is V_FOO or V_BAR

  symbol_type make_symbol (token_type type, int &&);
  // checks at runtime that type is I_BAZ or I_QUX

  symbol_type make_symbol (token_type type, string &&);
  // checks at runtime that type is S_BLA

These runtime checks might be implemented via a switch if that's
easier to auto-generate (it might be, in fact) or with a simple
"if (... || ...)" statement; that's an implementation detail.
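
As a rough sketch of what such checked functions could look like, using
the switch variant: symbol_type below is a simplified stand-in and not
Bison's actual class, and reporting a mismatch by throwing
std::invalid_argument is only an assumption (an assert or abort would
serve equally well).

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-ins for the generated types, for illustration only.
enum token_type { V_FOO, V_BAR, I_BAZ, I_QUX, S_BLA };

struct symbol_type
{
  token_type type;
  int int_value;            // used by I_BAZ and I_QUX
  std::string string_value; // used by S_BLA
};

// Token kinds without a semantic value.
symbol_type make_symbol (token_type type)
{
  switch (type)
    {
    case V_FOO:
    case V_BAR:
      return symbol_type { type, 0, std::string () };
    default:
      throw std::invalid_argument ("token kind requires a semantic value");
    }
}

// Token kinds whose semantic type is int.
symbol_type make_symbol (token_type type, int &&value)
{
  switch (type)
    {
    case I_BAZ:
    case I_QUX:
      return symbol_type { type, value, std::string () };
    default:
      throw std::invalid_argument ("token kind does not take an int");
    }
}

// Token kinds whose semantic type is string.
symbol_type make_symbol (token_type type, std::string &&value)
{
  switch (type)
    {
    case S_BLA:
      return symbol_type { type, 0, std::move (value) };
    default:
      throw std::invalid_argument ("token kind does not take a string");
    }
}
```

With these, make_symbol (I_QUX, 42) succeeds, while
make_symbol (I_BAZ, std::string ("oops")) is rejected at runtime
instead of producing a token whose kind and value disagree.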

> >> Maybe an option. Akim perhaps hasn't used this dynamic token
> >> lookup.
> > 
> > I guess he hasn't. But I don't think we need an option. These would
> > just be additional functions that one can use or not.
> The point of an option would be that those who do not need this feature
> could use a more optimal variant.

According to my proposal everyone could use any function. In fact,
my lexers do: they use the "safe" make_FOO functions by default, and
the (so far) unchecked ones for the dynamically looked-up tokens.

> >> Those that do might prefer not risking the program to bomb.
> > 
> > It's not that bad actually. Again, my lexers work fine as is.
> > I just brought this up because Akim proposed to call the function
> > "unsafe_..." which I thought was too harsh and proposed
> > "unchecked_..." -- but by adding the checks, it would be neither
> > unsafe nor unchecked. :)
> This worries me.

That's why I suggest adding the check. :)

> But also having to use something more complex to be returned by the
> lexer than a value on the lookup table.

The lexer returns a token which contains the token kind (an enum)
and the semantic value (a union value). A mismatch is bad. The
make_FOO functions avoid a mismatch and are suitable for statically
known token kinds. The direct constructor call can be used for
dynamic token kinds, but allows a mismatch. The functions I propose
to generate instead could be used for dynamic token kinds and avoid
a mismatch.
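
To illustrate the dynamic case, here is a minimal sketch of a lexer
that looks up keywords in a table and builds the token with a single
checked call. Everything in it is hypothetical: the types are
simplified stand-ins for what Bison generates, and keywords () and
lex_keyword () are made-up names. The check uses the simple
"if (... || ...)" form.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the generated types, for illustration only.
enum token_type { V_FOO, V_BAR, I_BAZ };

struct symbol_type
{
  token_type type;
};

// Checked builder for token kinds without a semantic value.
symbol_type make_symbol (token_type type)
{
  if (type == V_FOO || type == V_BAR)
    return symbol_type { type };
  throw std::invalid_argument ("token kind requires a semantic value");
}

// Keyword table mapping spellings to token kinds.
const std::map<std::string, token_type> &keywords ()
{
  static const std::map<std::string, token_type> table = {
    { "foo", V_FOO },
    { "bar", V_BAR },
    { "baz", I_BAZ } // deliberate mistake: I_BAZ needs an int value
  };
  return table;
}

// One code path for all keywords, no separate case per token kind.
// Throws std::out_of_range if the word is not in the table.
symbol_type lex_keyword (const std::string &word)
{
  return make_symbol (keywords ().at (word));
}
```

Here lex_keyword ("foo") yields a V_FOO token, whereas the wrong
table entry for "baz" is caught by the check rather than silently
producing a token with a missing value.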

Everything clear now?

