Re: GNU Bison: D language support

From: Akim Demaille
Subject: Re: GNU Bison: D language support
Date: Fri, 8 Feb 2019 06:10:28 +0100

[HS and I agreed to move to public lists now.  TS, bison-patches is
the right place to discuss code changes.  Please, resend your answer
to the predecessor of this message as a reply to this message.  TIA!]


Hi HS!

(Is this the proper way to call you?)

All the comments should be public on bison's lists.  Are you ok
that I repost this message there?

> On 7 Feb 2019, at 16:51, H. S. Teoh <address@hidden> wrote:
> On Tue, Feb 05, 2019 at 07:15:57AM +0100, Akim Demaille wrote:
> [...]
>>> On 4 Feb 2019, at 22:21, H. S. Teoh <address@hidden> wrote:
> [...]
>>> Recently Eduard Staniloiu posted a thread in the D forum requesting
>>> help with improving said D backend. I posted a somewhat lengthy
>>> reply, but did not receive any reply, so I thought I should contact
>>> you directly instead.
>> Doh...  So he did not ask you to post on Bison's list :(
>> I'll look for your answer.
> Hi,
> I don't know if you've found my post yet,

Didn't look for it yet...  Sorry, Bison is only a hobby, and the
time I can devote to it is quite irregular...

> but to save you the trouble, here it is in full:

That's very kind, thanks!

>> On Tue, Jan 15, 2019 at 03:13:44PM +0000, Eduard Staniloiu via Digitalmars-d wrote:
>> [...]
>> I glanced briefly at the various D-related notes, and took a good look
>> at the generated calc.d in the examples/d directory.  Here are some
>> comments:
>> - I understand that the current D codegen is mainly based on the
>> existing Java backend, so unsurprisingly quite a few places show
>> signs of being very Java-like rather than D-like.  Hopefully, with
>> some work, we can get it to emit more idiomatic D. :-)

I had the same feeling!  Let's work on this.

>> - The first question I have is how much the Bison API depends on the
>> lexer being swappable at runtime, i.e., via the Lexer interface.
>> I'm having a hard time imagining that there will be many use cases
>> where you'd like to swap lexers with the same parser at runtime, so
>> I'm thinking the parser should simply take the lexer type as a
>> template argument, with sig constraints ensuring that whatever type
>> the user passes in implements the necessary methods for the parser
>> to work.  This lets us bind the lexer to the parser at compile-time,
>> and elide the vtable indirection (it can still be done if the user
>> passes in a class).

Makes sense to me.
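
A minimal C++ sketch of that idea, for illustration only: the lexer
type is a template parameter and a static_assert stands in for a D
template constraint.  The names Parser, VectorLexer, count_tokens and
the token codes are hypothetical, not part of any Bison skeleton.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical token codes; 0 marks end of input, as in Bison.
enum Token { END = 0, NUM = 1, PLUS = 2 };

// The lexer type is a template parameter, so the call to yylex() is
// bound at compile time and needs no virtual dispatch.
template <typename Lexer>
class Parser {
  // Compile-time check standing in for a D template constraint.
  static_assert(
      std::is_same_v<decltype(std::declval<Lexer&>().yylex()), int>,
      "Lexer must provide `int yylex()`");

public:
  explicit Parser(Lexer& lexer) : lexer_(lexer) {}

  // Toy stand-in for parse(): counts tokens until end of input.
  int count_tokens() {
    int n = 0;
    while (lexer_.yylex() != END) ++n;
    return n;
  }

private:
  Lexer& lexer_;
};

// A plain struct is enough; no base class or interface is required.
struct VectorLexer {
  std::vector<int> tokens;
  std::size_t pos = 0;
  int yylex() { return pos < tokens.size() ? tokens[pos++] : END; }
};
```

A user who does want runtime polymorphism can still pass a class with
a virtual yylex() as the template argument.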

>> - Along a similar vein, I'm wondering if the generated parser ought to
>> be a class at all, or is the inheritability of the parser a key
>> Bison feature?  Also, are language-specific directives supported /
>> encouraged?  If so, it might be worthwhile to let the user choose
>> whether to use a struct/template API vs. an OO class-based API.

I guess I don't know what you call a class here (as you certainly
know class and struct are roughly equivalent in C++), but I read
your comment as "class should be reserved to participants of

Well, I made no effort to have the C++ parser derive from some
base class.  Yet, there are a couple of "virtual" in the API, but
let's consider them historical artifacts.  I don't think it makes
much sense to have a hierarchy of parsers.

>> - On a more high-level note, I'm wondering how flexible the API of the
>> parser can be.  The main thought behind this is that given enough
>> flexibility, we may be able to target, e.g., @nogc, @safe, pure,
>> etc., with @safe probably a pretty important target, if it's
>> possible to do so.  While this depends of course on the exact code
>> the user puts into the .y file, a worthy goal is to make the emitted
>> D code @safe (pure, etc.) by default unless the user writes
>> address@hidden code in the .y file.

I cannot comment on this.  But the generated parser should aim for
the fewest constraints.  So the generated code itself should not
require a GC, IMHO.

>> - How flexible can the lexer API be?  For example, currently
>> lexer.yyerror takes a string argument, which requires using
>> std.format in various places.  If permissible, I'd like to have
>> yyerror take a generic input range instead, so that we can avoid the
>> inherent memory allocation of std.format (e.g., if we wish to target
>> @nogc).

lexer.yyerror?  yyerror is expected to be part of the parser,
not the scanner.

It's painful that the interface of yyerror has to be declared
by the user in C and C++, but, again, that's rather a historical
scar.  You should aim at a fixed signature.

>> - Also, is it possible to use exceptions instead of yyerror()?  Or
>> would that deviate too far from Bison's design?

That would be a misunderstanding of the purpose of yyerror.
This function is called when there's a syntax error, and the
error message must be passed to the user.  The implementation
of yyerror then decides whether to print to stderr, syslog it,
open a GUI, or raise an exception, why not.  But still, that
would be a waste, because the parser may be able to recover
from the error, and gather even more error messages, which may be
delivered eventually to the user after the parse has finished.
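
As a rough C++ illustration of that last point (a sketch, not the
actual skeleton): yyerror records the message and returns, so parsing
can resume and report further errors.  CollectingParser and its toy
parse() loop are hypothetical; only the name yyerror comes from Bison.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical parser skeleton: only the error-reporting path is shown.
class CollectingParser {
public:
  // Called once per syntax error.  It stores the message and returns,
  // so the caller can attempt recovery and keep going.
  void yyerror(const std::string& msg) { errors_.push_back(msg); }

  // Toy stand-in for a parse: every negative "token" counts as a
  // syntax error, after which parsing resumes with the next token.
  bool parse(const std::vector<int>& tokens) {
    for (std::size_t i = 0; i < tokens.size(); ++i)
      if (tokens[i] < 0)
        yyerror("syntax error at token " + std::to_string(i));
    return errors_.empty();
  }

  // All messages gathered during the parse, in order of occurrence.
  const std::vector<std::string>& errors() const { return errors_; }

private:
  std::vector<std::string> errors_;
};
```

Throwing from yyerror instead would stop at the first error and lose
the later messages.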

>> - On a more general note, I'd like to make the parser/lexer APIs
>> range-based as much as possible, esp. when it comes to
>> string-handling.  But I'm just not sure how much the APIs are
>> expected to conform to the analogous C/C++/Java APIs.

Because in practice the maintenance falls on the shoulders of
the Bison maintainers, we want the API to remain as similar as
possible, without being unnatural to the host language.

>> - I wonder if YYSemanticType could use std.variant somehow instead of
>> a raw union, which would probably force the parser to be @system.

A union is the natural storage: the parser knows the type of
the current value, it does not need variants that duplicate
this knowledge of the current type.
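
A small C++ sketch of why no tag is needed, assuming a hypothetical
calculator rule `exp: exp '+' exp`; SemanticValue and reduce_add are
illustrative names, not generated code.

```cpp
#include <cassert>

// Hypothetical semantic value for a tiny calculator grammar.
union SemanticValue {
  double number;     // used by symbols declared as numbers
  const char* name;  // used by symbols declared as identifiers
};

// Reduction for the hypothetical rule `exp: exp '+' exp`.
// The grammar fixes the type of each operand, so the action reads
// `number` directly, with no runtime tag to consult.
SemanticValue reduce_add(const SemanticValue& lhs,
                         const SemanticValue& rhs) {
  SemanticValue result;
  result.number = lhs.number + rhs.number;
  return result;
}
```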

>> - Can Bison handle UTF-8 lexer/parser rules?  D uses UTF-8 by default,
>> and it would be nice to leverage this support instead of manually
>> iterating over bytes, as is done in a few places.

Bison does not care about your encoding, it sits on top of a
stream of tokens, not a stream of characters.  Again, because
of history, it accepts bytes as tokens-of-the-poor, but it should
not learn to read UTF-8, that's not its business.

>> - Some minor points that should be easy to fix:
>>  - The YYACCEPT, YYABORT, etc., symbols really should be declared as
>>    enums rather than static ints.


>>  - D does support the #line directive.  So these should be emitted
>>    as they are in C/C++. (I noticed they currently only appear as
>>    comments.)

Sure.  This could easily be your first contribution :)  It does not
require paperwork.

>>  - YYStack needs to be fixed to avoid the reallocate-on-every-push
>>    problem on arrays. A common beginner's mistake.  Also, if we're
>>    going to target @nogc (not 100% sure about that right now), we
>>    may have to forego built-in arrays altogether.
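
For illustration, a C++ sketch of a stack that doubles its capacity
when full, so that a run of pushes reallocates only occasionally.
StateStack and its initial capacity of 4 are hypothetical choices, not
the actual YYStack.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical stack of parser states that doubles its capacity when
// full, so n pushes copy O(n) elements in total.
class StateStack {
public:
  void push(int state) {
    if (size_ == capacity_) grow();
    data_[size_++] = state;
  }

  // Precondition: the stack is not empty.
  int pop() { return data_[--size_]; }

  std::size_t size() const { return size_; }
  std::size_t capacity() const { return capacity_; }

private:
  // Allocate a buffer twice as large and copy the live elements over.
  void grow() {
    std::size_t new_cap = capacity_ == 0 ? 4 : capacity_ * 2;
    std::unique_ptr<int[]> bigger(new int[new_cap]);
    for (std::size_t i = 0; i < size_; ++i) bigger[i] = data_[i];
    data_ = std::move(bigger);
    capacity_ = new_cap;
  }

  std::unique_ptr<int[]> data_;
  std::size_t size_ = 0;
  std::size_t capacity_ = 0;
};
```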


>>> Akim is going to provide assistance with the process, but he is not
>>> to be expected to carry this task on his own.
>> [...]
>> Dumb question: If I wanted to contribute some commits, do I have to
>> sign up on <>?  What's the procedure for submitting pull requests?
>> (Sorry, I glanced over the READMEs and the FAQ at <> but didn't
>> find a clear answer.)

No, you don't need an account on Savannah.

> Thanks for the info!  I looked over the form, and it seems rather
> specific to a changeset.  Does it need to be filled out for every change
> submitted, or only for the first time?

Only the first time.  :)

>> Submit your patches on address@hidden <mailto:address@hidden>.
>> There, we will discuss style issues, etc. to improve the patch,
>> then I'll install it.  You may also fork my <> (which will help
>> with the CI too), but the patches must be posted on bison-patches.
> [...]
> So you do not use Github's pull request system?  Or do you use that in
> conjunction with the bison-patches mailing list?

GitHub is not Free Software, so "of course" we will never require it.
And using the mailing lists is a long tradition at GNU.  So please,
submit your patches there, for them to be discussed.  But feel free
to fork my repo and submit PRs there, at least to get CI.

