help-bison

Re: Typescript grammar in Bison


From: Simon Richter
Subject: Re: Typescript grammar in Bison
Date: Wed, 23 Mar 2022 09:33:17 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.14.0

Hi,

On 3/22/22 9:24 PM, Ricard Gascons wrote:

> I've been a Bison user for some time now, I've been writing some toy
> projects here and there. I was wondering if there are online resources I
> could find existing grammar from well-known programming languages? Just the
> definitions, not the implementations of course.
>
> More specifically, I'm looking for an existing Typescript grammar I could
> use for a personal project. I guess Javascript would work too. I've looked
> online and haven't had any luck so far.

I have the beginnings of one. It at least accepts the official ECMAScript test suite, but quite a lot is still missing from both the grammar and the test suite.

The main problem with ECMAScript is that in principle it would be possible to build an LALR parser for it, but there is no way to express this sensibly in Bison, as the definition uses "negative tokens" and last-resort parsing extensively.

The "continue", "break", "return", "throw" and "yield" keywords can each be followed by an operand (a label for "continue" and "break", an expression for the others; for "throw" it is mandatory), but no newline is permitted between the keyword and the operand. Likewise, no newline is permitted before the "=>" token in an arrow function.

This works together with automatic semicolon insertion to accept programs with missing semicolons. Roughly, the rule is that if a token is unexpected, but the parse would be valid if a semicolon were inserted before the current token, then the parser should pretend that the semicolon was there (the specification additionally requires that the unexpected token follow a line terminator or be "}").

That can be expressed in Bison in some, but not all, cases, e.g.:

    lexical_declaration:
        "let" binding_list ';'
    |   "let" binding_list

works for me, and I can generate an "inserted semicolon" warning from the action of the second alternative. However, this technique doesn't work everywhere a semicolon is expected, because it introduces conflicts that technically belong to another rule, so precedence cannot resolve them.
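
To make the insertion rule concrete outside of Bison, here is a minimal hand-written sketch in C++. It is not taken from the attached parser: the toy grammar (a program is a sequence of `"let" IDENT ";"` statements), the `ParseResult` struct and the `parse_program` function are all made up for illustration, and the line-terminator condition is deliberately ignored. When the token after the identifier is not a semicolon, the sketch checks whether a semicolon at that position would let the parse continue, and if so counts one inserted semicolon instead of failing.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Result of the toy parse: whether the input was accepted, and how many
// semicolons had to be "inserted" to accept it.
struct ParseResult {
    bool ok;
    int inserted_semicolons;
};

// Toy grammar (hypothetical, far smaller than ECMAScript):
//   program   := statement*
//   statement := "let" IDENT ";"
// Tokens are pre-split strings; anything other than "let" and ";" is an IDENT.
ParseResult parse_program(const std::vector<std::string>& toks) {
    ParseResult r{true, 0};
    std::size_t i = 0;
    while (i < toks.size()) {
        if (toks[i] != "let") return {false, r.inserted_semicolons};
        ++i;
        if (i >= toks.size() || toks[i] == "let" || toks[i] == ";")
            return {false, r.inserted_semicolons};  // missing identifier
        ++i;
        if (i < toks.size() && toks[i] == ";") {
            ++i;  // explicit semicolon
        } else if (i == toks.size() || toks[i] == "let") {
            // Unexpected token (or end of input), but the parse would be
            // valid with a ';' here: pretend it was present and count it.
            ++r.inserted_semicolons;
        } else {
            return {false, r.inserted_semicolons};  // a ';' would not help
        }
    }
    assert(i == toks.size());  // every token was consumed
    return r;
}
```

With this sketch, the token sequence `let a let b` is accepted with two inserted semicolons, while `let a b` is rejected, because a semicolon before `b` would still not yield a valid statement. A caller could turn a non-zero `inserted_semicolons` into the "inserted semicolon" warning.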

Feel free to use the attached lexer and parser as a basis. The "NOLT" comments indicate places where no line terminator is allowed; this is currently not enforced.
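
One possible way to enforce the "no line terminator" restriction is to have the lexer record, for every token, whether a newline preceded it, and to consult that flag at the restricted positions. The C++ sketch below illustrates the idea only. It is not how the attached lexer works, and the `Token` struct, the whitespace-splitting `tokenize` function and the `return_has_operand` helper are hypothetical names introduced here.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token record: `newline_before` is true when at least one
// '\n' separates this token from the previous one.
struct Token {
    std::string text;
    bool newline_before;
};

// Toy tokenizer: splits on whitespace only (not a real ECMAScript lexer,
// and only '\n' is treated as a line terminator).
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> out;
    bool saw_newline = false;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (c == '\n') { saw_newline = true; ++i; continue; }
        if (std::isspace(c)) { ++i; continue; }
        std::size_t start = i;
        while (i < src.size() &&
               !std::isspace(static_cast<unsigned char>(src[i]))) {
            ++i;
        }
        out.push_back({src.substr(start, i - start), saw_newline});
        saw_newline = false;  // the flag applies to one token only
    }
    return out;
}

// True if the token following the "return" at index i may be taken as its
// operand, i.e. it exists, is not ';', and is not preceded by a newline.
bool return_has_operand(const std::vector<Token>& toks, std::size_t i) {
    assert(i < toks.size());
    return i + 1 < toks.size()
        && !toks[i + 1].newline_before
        && toks[i + 1].text != ";";
}
```

For the input `return x` the helper reports an operand; for `return`, a newline, then `x`, it does not, so a parser built on it would end the statement after `return` and rely on semicolon insertion there.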

   Simon

Attachment: es.ll.xz
Description: application/xz

Attachment: es.yy.xz
Description: application/xz

Attachment: OpenPGP_signature
Description: OpenPGP digital signature

