[PATCH 06/10] api.token.raw: document it


From: Akim Demaille
Subject: [PATCH 06/10] api.token.raw: document it
Date: Sun, 1 Sep 2019 18:41:19 +0200

* doc/bison.texi: here.
---
 NEWS           | 14 ++++++++++++++
 TODO           |  6 +++++-
 doc/bison.texi | 36 ++++++++++++++++++++++++++++++++++++
 3 files changed, 55 insertions(+), 1 deletion(-)
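
As an illustration of the table lookup described in the NEWS entry, here is
a minimal C sketch.  It is not the code Bison generates: the names
translate, translate_raw, EXT_* and SYM_* are hypothetical, and the token
numbers are toy values chosen to follow the numbering described in this
patch (external tokens from 257, internal symbols from 3).

```c
/* Toy numbering for illustration only; these are assumptions, not values
   taken from a real generated parser.  */
enum external_token { EXT_EOF = 0, EXT_NUM = 257, EXT_ID = 258 };
enum internal_symbol
{
  SYM_EOF = 0, SYM_ERROR = 1, SYM_UNDEF = 2,  /* reserved symbols */
  SYM_NUM = 3, SYM_ID = 4, SYM_PLUS = 5       /* user tokens start at 3 */
};

#define MAX_EXTERNAL 258

/* Default numbering: every token returned by the scanner goes through one
   table lookup to obtain the internal symbol number.  */
static int
translate (int external)
{
  static unsigned char table[MAX_EXTERNAL + 1];
  static int initialized = 0;
  if (!initialized)
    {
      /* Anything not listed below is an unknown token.  */
      for (int i = 0; i <= MAX_EXTERNAL; ++i)
        table[i] = SYM_UNDEF;
      table[EXT_EOF] = SYM_EOF;
      table['+'] = SYM_PLUS;   /* a character token stands for itself */
      table[EXT_NUM] = SYM_NUM;
      table[EXT_ID] = SYM_ID;
      initialized = 1;
    }
  if (external < 0 || MAX_EXTERNAL < external)
    return SYM_UNDEF;
  return table[external];
}

/* Raw numbering: the scanner already returns the internal symbol number,
   so no table and no lookup are needed.  */
static int
translate_raw (int token)
{
  return token;
}
```

In this sketch a parser using the default numbering would call
translate (yylex ()) for each token, whereas with raw numbers the value
returned by the scanner is used directly.  Since a character such as '+'
no longer maps to itself in that mode, the sketch is consistent with the
restriction on character literals documented below.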

diff --git a/NEWS b/NEWS
index 05bcee8a..781f57cf 100644
--- a/NEWS
+++ b/NEWS
@@ -18,6 +18,20 @@ GNU Bison NEWS
   The C++ deterministic skeleton (lalr1.cc) now supports LAC, via the
   %define variable parse.lac.
 
+*** Variable api.token.raw: Optimized token numbers (all skeletons)
+
+  In the generated parsers, tokens have two numbers: the "external" token
+  number as returned by yylex (which starts at 257), and the "internal"
+  symbol number (which starts at 3).  Each time yylex is called, a table
+  lookup maps the external token number to the internal symbol number.
+
+  When the %define variable api.token.raw is set, tokens are assigned their
+  internal number, which saves one table lookup per token, and also saves
+  the generation of the mapping table.
+
+  The gain is typically moderate, but in extreme cases (very simple user
+  actions), a 10% improvement can be observed.
+
 *** Debug traces in Java
 
   The Java backend no longer emits code and data for parser tracing if the
diff --git a/TODO b/TODO
index 0ddd6729..f0ec27da 100644
--- a/TODO
+++ b/TODO
@@ -73,7 +73,11 @@ syntax error, unexpected $end, expecting ↦ or 🎅🐃 or '\n'
 
 
 While at it, we should stop using "$end" by default, in favor of "end of
-file", or "end of input", whatever.
+file", or "end of input", whatever.  See how lalr1.java does that.
+
+** api.token.raw
+Maybe we should exhibit the YYUNDEFTOK token.  It could also be assigned a
+semantic value so that yyerror could be used to report invalid lexemes.
 
 * Bison 3.6
 ** Unit rules
diff --git a/doc/bison.texi b/doc/bison.texi
index 9b6981d3..5a171639 100644
--- a/doc/bison.texi
+++ b/doc/bison.texi
@@ -6212,6 +6212,42 @@ introduced in Bison 3.0
 @c api.token.prefix
 
 
+@c ================================================== api.token.raw
+@deffn Directive {%define api.token.raw}
+
+@itemize @bullet
+@item Language(s):
+all
+
+@item Purpose:
+The output files normally define the tokens with Yacc-compatible token
+numbers: sequential numbers starting at 257 except for single character
+tokens which stand for themselves (e.g., in ASCII, @samp{'a'} is numbered
+65).  The parser however uses symbol numbers assigned sequentially starting
+at 3.  Therefore each time the scanner returns an (external) token number,
+it must be mapped to the (internal) symbol number.
+
+When @code{api.token.raw} is set, tokens are assigned their internal number,
+which saves one table lookup per token to map them from the external to the
+internal number, and also saves the generation of the mapping table.  The
+gain is typically moderate, but in extreme cases (very simple user actions),
+a 10% improvement can be observed.
+
+When @code{api.token.raw} is set, the grammar cannot use character literals
+(such as @samp{'a'}).
+
+@item Accepted Values: Boolean.
+
+@item Default Value:
+@code{false}
+@item History:
+introduced in Bison 3.5.  Was initially introduced in Bison 1.25 as
+@samp{%raw}, but never worked and was removed in Bison 1.29.
+@end itemize
+@end deffn
+@c api.token.raw
+
+
 @c ================================================== api.value.automove
 @deffn Directive {%define api.value.automove}
 
-- 
2.23.0



