Re: Typescript grammar in Bison
From: Simon Richter
Subject: Re: Typescript grammar in Bison
Date: Wed, 23 Mar 2022 09:33:17 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.14.0
Hi,
On 3/22/22 9:24 PM, Ricard Gascons wrote:
> I've been a Bison user for some time now, I've been writing some toy
> projects here and there. I was wondering if there are online resources I
> could find existing grammar from well-known programming languages? Just the
> definitions, not the implementations of course.
>
> More specifically, I'm looking for an existing Typescript grammar I could
> use for a personal project. I guess Javascript would work too. I've looked
> online and haven't had any luck so far.
I have a beginning of one that accepts the official ECMAScript testsuite
at least, but there is quite a lot missing both in the grammar and in
the testsuite.
The main problem with ECMAScript is that in principle it would be
possible to build an LALR parser for it, but there is no way to express
this sensibly in Bison, as the definition uses "negative tokens" and
last-resort parsing extensively.
The "return" and "yield" statements take an optional expression,
"continue" and "break" take an optional label, and "throw" takes a
mandatory expression. In each case no newline is permitted between the
keyword and what follows it. Likewise, no newline is permitted before
the "=>" token in an arrow function.
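One way to make the newline restriction visible to a parser is to have the lexer record, for each token, whether a line terminator was skipped just before it. The following is only a minimal C++ sketch of that idea, not the code from the attached lexer: `Token`, `tokenize` and `operand_follows` are hypothetical names, the tokenizer splits on whitespace only, and it treats just `\n` and `\r` as line terminators.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical token type: the text plus a flag saying whether at least one
// line terminator was skipped between the previous token and this one.
struct Token {
    std::string text;
    bool newline_before;
};

// Toy tokenizer (assumption: tokens are separated by whitespace only).
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> out;
    bool saw_newline = false;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (c == '\n' || c == '\r') { saw_newline = true; ++i; continue; }
        if (std::isspace(c)) { ++i; continue; }
        std::size_t start = i;
        while (i < src.size() &&
               !std::isspace(static_cast<unsigned char>(src[i]))) {
            ++i;
        }
        out.push_back({src.substr(start, i - start), saw_newline});
        saw_newline = false;  // the flag belongs to the token just emitted
    }
    return out;
}

// True if toks[i] is one of the restricted keywords and the next token
// starts on the same line, so it may be taken as the keyword's operand.
bool operand_follows(const std::vector<Token>& toks, std::size_t i) {
    static const char* const restricted[] = {
        "continue", "break", "throw", "return", "yield"};
    bool is_restricted = false;
    for (const char* kw : restricted) {
        if (toks[i].text == kw) is_restricted = true;
    }
    return is_restricted && i + 1 < toks.size() &&
           !toks[i + 1].newline_before;
}
```

With this sketch, `return x` has an operand for "return", while `return` followed by a newline and `x` does not. A Bison grammar would still need some way to act on the flag, for example by having the lexer return a different token kind.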
This works together with automatic semicolon insertion to accept
programs with missing semicolons -- the rule is that if a token is
unexpected, but the parse would be valid if a semicolon was inserted
before the current token, then the parser should pretend that the
semicolon was there.
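The rule as stated above can be sketched as a small decision function. This is a simplified C++ illustration, not how Bison or the attached parser does it: `Action`, `decide` and the `accepts` callback are hypothetical, with `accepts` standing in for "the parser can shift this token in its current state". The sketch only checks that a semicolon is acceptable, not that the rest of the parse succeeds afterwards, and it leaves out the further conditions the ECMAScript specification places on the offending token.

```cpp
#include <functional>
#include <string>

// What the driver should do with the current lookahead token.
enum class Action { Shift, InsertSemicolon, Error };

// `accepts` is a hypothetical stand-in for the parser's current state:
// it returns true if the given token could be shifted right now.
Action decide(const std::string& lookahead,
              const std::function<bool(const std::string&)>& accepts) {
    if (accepts(lookahead)) {
        return Action::Shift;            // token is expected, nothing to do
    }
    if (accepts(";")) {
        return Action::InsertSemicolon;  // pretend ';' preceded the token
    }
    return Action::Error;                // a semicolon would not help either
}
```

For example, in a state that accepts only `;` or `,` (say, after a binding list), a lookahead of `,` is shifted, while a lookahead of `let` triggers semicolon insertion.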
That can be expressed in some, but not all, cases, e.g.:
    lexical_declaration:
        "let" binding_list ';'
      | "let" binding_list
works for me, and I can generate a warning "inserted semicolon" from the
action of the second alternative, but this technique doesn't work
everywhere a semicolon is expected, because elsewhere it introduces
conflicts that technically belong to another rule, so precedence
declarations cannot resolve them.
Feel free to use the attached lexer and parser as a basis. The "NOLT"
comment indicates places where no line terminator is allowed; this is
currently not enforced.
Simon
es.ll.xz
Description: application/xz
es.yy.xz
Description: application/xz
OpenPGP_signature
Description: OpenPGP digital signature