emacs-devel

Re: SMIE implementation for the C-like languages


From: Arthur Evstifeev
Subject: Re: SMIE implementation for the C-like languages
Date: Tue, 10 Nov 2015 16:25:20 +1300

Stefan Monnier writes:

Thanks for the answers and sorry for the lack of examples in my first post.

>> Language itself is close to the family of C-like languages
>> with some differences to the language constructions.  I'm looking for
>> some advice about applying smie to the languages that use braces as a
>> terminators for the code blocks.
>
> Indeed, SMIE is not great for that currently.
>
> I have an "smc-mode" (i.e. SMIE-based c-mode) here which I wrote as an
> exercise to try and see what it takes to get SMIE working acceptably for
> the C language syntax.  It's not usable (it was really meant as an
> experimental prototype/proof-of-concept), but if you're interested to
> look at it, I could make it available somewhere.
>

If it's not hard to do, I'd appreciate that.

>> 1. As stated in documentation tokens that are defined in syntax table
>> don't have to be tokenised in lexer. I tried to go this way, but
>> encountered situations where defined grammars are not respected.
>
> Not sure which situations you're referring to.
>

For example, take this grammar:

(id)
(inst ("if" exp "{" insts "}")
      (exp))
(insts (insts ";" insts) (inst))
(exp (exp "." id)
     (id ":" exp)
     (exp "=" exp))
(exps (exps "," exps) (exp))
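For context, a fragment like this is normally passed through
smie-bnf->prec2 and smie-prec2->grammar. A minimal sketch of that
wrapping follows. The variable name my-lang-smie-grammar and the three
precedence tables are assumptions made for illustration; they are not
taken from the mode under discussion.

```emacs-lisp
(require 'smie)

;; Hypothetical variable name.  The BNF is the fragment quoted above; the
;; precedence tables after it are assumed, chosen only to resolve the
;; ambiguities among ";", "," and the expression operators.
(defvar my-lang-smie-grammar
  (smie-prec2->grammar
   (smie-bnf->prec2
    '((id)
      (inst ("if" exp "{" insts "}")
            (exp))
      (insts (insts ";" insts) (inst))
      (exp (exp "." id)
           (id ":" exp)
           (exp "=" exp))
      (exps (exps "," exps) (exp)))
    ;; Each table lists operators from lowest to highest precedence.
    '((assoc ";"))
    '((assoc ","))
    '((right "=") (left ":") (left ".")))))
```

The resulting value is what would be handed to smie-setup as its
grammar argument.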

When trying to indent this code:

if true {
    |bar
}

The token "bar" is indented incorrectly, and I see the following requests
from SMIE to the lexer and the indentation rules:

forward: 15 -> 18 = bar
backward: 15 -> 10 =
forward: 9 -> 9 =
backward: 9 -> 4 = true
backward: 4 -> 1 = if
forward: 9 -> 9 =
:after '{'; sibling-p:nil parent:(nil 4 if) hanging:t == nil
forward: 9 -> 9 =  [2 times]
backward: 9 -> 4 = true
backward: 4 -> 1 = if
forward: 9 -> 9 =
:before '{'; sibling-p:nil parent:(nil 4 if) hanging:t == nil
forward: 9 -> 9 =
backward: 9 -> 4 = true [2 times]
backward: 4 -> 1 = if [3 times]
:list-intro 'if'; sibling-p:nil parent:nil hanging:nil == nil
forward: 4 -> 8 = true
:elem 'args'; sibling-p:nil parent:nil hanging:nil == nil
forward: 4 -> 8 = true
:elem 'basic'; sibling-p:nil parent:nil hanging:nil == 4
forward: 9 -> 9 =
:elem 'basic'; sibling-p:nil parent:nil hanging:t == 4

Neither the logged lexer calls nor the indentation requests seem to
respect the defined grammar. Note that the lexer returns an empty token
at the brace (position 9).

>> It seems that smie only tries to indent closer token with respect to
>> the opener, rather than parent token defined by grammar.
>
> By default, yes.  Of course, the smie-rule-function is there to tweak
> that as/when needed.
>
>> cases, but I encountered issues with paren blinking: in some
>> situations blinking fails with "Mismatched parenthesis".
>
> Same as before: if you don't give an example, it's hard to know what
> might be the cause.

For the previous case, if we change the lexer to tokenize braces and
indent the same construct, the indentation is correct and the SMIE
output is more in line with the defined grammar:

forward: 22 -> 25 = bar
backward: 22 -> 9 = {
forward: 9 -> 10 = {
backward: 10 -> 9 = {
backward: 9 -> 4 = true
backward: 4 -> 1 = if
forward: 9 -> 10 = {
:after '{'; sibling-p:nil parent:(nil 1 if) hanging:t == nil
forward: 9 -> 10 = {
backward: 9 -> 4 = true
backward: 4 -> 1 = if
forward: 9 -> 10 = {
:before '{'; sibling-p:nil parent:(nil 1 if) hanging:t == nil
forward: 9 -> 10 = {
:elem 'basic'; sibling-p:nil parent:nil hanging:t == 4

But calling blink-matching-open on a simple code block:

{
}|

fails with a "Mismatched parenthesis" error.

>
>> During some tests I decided to change lexer rules for braces to return
>> begin/end tokens instead of braces. I noticed that smie still tries to
>> indent "}" token in some situations, specifically `:close-all . "}"`.
>
> At this point, I'm very confused, because I don't know what your code
> does when, nor when you see which behavior.
>

If we modify the lexer from the last case so that it returns begin/end
tokens instead of braces, something like this:

((looking-at "{") (forward-char 1) "begin")
((looking-at "}") (forward-char 1) "end")
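For context, clauses like these would sit inside a forward-token
function, paired with a mirrored backward-token function. A minimal
sketch, assuming hypothetical function names (my-lang-forward-token and
my-lang-backward-token) and falling back to SMIE's default lexer for
everything else:

```emacs-lisp
(require 'smie)

;; Hypothetical names; only the two brace clauses come from the text above.
(defun my-lang-forward-token ()
  "Return the token after point, mapping braces to \"begin\"/\"end\"."
  (forward-comment (point-max))         ; skip whitespace and comments
  (cond
   ((looking-at "{") (forward-char 1) "begin")
   ((looking-at "}") (forward-char 1) "end")
   (t (smie-default-forward-token))))

(defun my-lang-backward-token ()
  "Return the token before point, mapping braces to \"begin\"/\"end\"."
  (forward-comment (- (point)))         ; skip whitespace and comments
  (cond
   ((eq (char-before) ?{) (backward-char 1) "begin")
   ((eq (char-before) ?}) (backward-char 1) "end")
   (t (smie-default-backward-token))))
```

Both functions would be passed to smie-setup through its :forward-token
and :backward-token keywords, and the two must agree on every token.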

Then, if we change the grammar accordingly and try to indent this:

if true {
    bar
|}

the SMIE output still contains an indentation request for "}" even
though the lexer never returned such a token:

forward: 19 -> 20 = end
backward: 20 -> 19 = end
backward: 19 -> 18 = ;
backward: 18 -> 15 = bar
backward: 15 -> 9 = begin
backward: 9 -> 4 = true
backward: 4 -> 1 = if
:close-all '}'; sibling-p:t parent:(nil 1 if) hanging:nil == nil
backward: 20 -> 19 = end
backward: 19 -> 18 = ;
backward: 18 -> 15 = bar
backward: 15 -> 9 = begin
backward: 9 -> 4 = true
backward: 4 -> 1 = if

>> So my question is what will be the semantically correct way of
>> handling braces for the C-like languages?
>
> Where?  In the lexer, the grammar, or the indentation rules?
> Note that even if you answer this question, there's no single right
> answer: you largely get to decide and pick between different consequences.
>

OK, together with the other answers, this makes sense to me.

>> And secondary question is it expected that smie tries to indent tokens
>> that are not returned by lexer?
>
> If your tokenizer returns nil (or "") and you're in front of a paren,
> then SMIE will take this paren to be the next token, yes.
>

This question was about the situation where the lexer tokenizes such
tokens itself instead of relying on SMIE, as in the last example.

>> 2. As a sort of continuation of the previous problem, we are having
>> problem understanding what will be semantically correct way of defining
>> `sexp` for the smie based mode. At the moment we see a different
>> behavior between non-smie c++ mode (which is close to the Swift)
>> and something like ruby-mode. One of the contributers summarised
>> differences in this post
>> https://github.com/chrisbarrett/swift-mode/pull/117#issuecomment-154753070.
>> I personally think grammar based sexp provided by smie are extremely
>> useful, but they yield confusing results when it comes to blinking
>> parens. For example grammar for "if" from here:
>> https://github.com/chrisbarrett/swift-mode/blob/simplify_smie/swift-mode.el#L74-L129
>
> Does Swift allow a "{ ... }" block to appear on its own (rather than
> as part of a while/if/...), like in C?
>
> If it does (and you want swift-mode to support those blocks), then
> I think your approach to use rules like
>
>    ("while" exp "{" insts "}")
>
> will be very problematic because SMIE's parser when parsing backward
> (which is the more usual direction) won't know whether to stop after
> skipping a {...} or whether to keep going on the off-chance that this
> {...} is really part of a for/while/if...
> [ The specific complaint you should get is that "{" will appear both as
>   an "opener" and as a "neither" (aka "inner") token.  ]
>

We came to the same conclusion recently and will try to alter the
grammar to remove this ambiguity.

>> works well for indentation and movements, but blinking on the close
>> ("}") returns "if" token.
>
> Indeed, if these two notions of "sexp" need to be different, then you should
> probably disable SMIE's builtin paren-blinking support.

I tried to remove the post-self-insert-hook entry after smie-setup, the
same way octave-mode does:

(remove-hook 'post-self-insert-hook #'smie-blink-matching-open 'local)

But this doesn't change the blinking behavior: for the same "if"
construct, the "if" token still blinks. Is there a different way to
alter this behavior?

>
>
>         Stefan

Thank you,
Arthur

--
Sent with my mu4e


