bug-gnu-emacs


bug#60655: 30.0.50; tree-sitter: `treesit-transpose-sexps' is broken.


From: Mickey Petersen
Subject: bug#60655: 30.0.50; tree-sitter: `treesit-transpose-sexps' is broken.
Date: Mon, 09 Jan 2023 12:30:11 +0000
User-agent: mu4e @VERSION@; emacs 30.0.50

Theodor Thornhill <theo@thornhill.no> writes:

> Mickey Petersen <mickey@masteringemacs.org> writes:
>
>> Theodor Thornhill <theo@thornhill.no> writes:
>>
>>> Mickey Petersen <mickey@masteringemacs.org> writes:
>>>
>>>> The tree-sitter-enabled function, `treesit-transpose-sexps', that is
>>>> called by transpose-sexps, is broken.
>>>>
>>>> It uses a naive method of sibling adjacency to determine
>>>> transpositions. But it is unfortunately not correct.
>>>>
>>>> Python:
>>>>
>>>>
>>>>   def -!-foo():
>>>>       pass
>>>>
>>>> Turns into this with `C-M-t':
>>>>
>>>>   def ()foo:
>>>>       pass
>>>>
>>>> But it ought to be:
>>>>
>>>>   foo def():
>>>>       pass
>>>>
>>>>
>>>> It's swapping two siblings that are indeed adjacent in the tree, but
>>>> not on screen, which is confusing and a regression from its previous
>>>> behaviour.
>>>>
>>>
>>> I can try to make transpose-sexps rely on only swapping "allowed"
>>> node-types?  That would be able to keep the new, better function, yet
>>> still disallow these syntax-breaking transposes.  What do you think?
>>>
>>
>> This is a hard problem. I'm building the self-same in Combobulate, so
>> when I saw this implementation I saw a well-trodden path by myself.
>> There's a lot of subtlety to it, and it is not immediately possible to
>> accurately gauge the right things to swap with simple (or not so
>> simple) sibling transpositions.
>>
>> Using a defined list is better, but with the caveat that it requires manual
>> intervention per mode. This is a really tricky thing to build well.
>
> Yeah, but I guess that is a sensible change.  It isn't easy, no, so I'm
> open for suggestions and improvements.  IMO an improvement would be to
> increase the likelihood that a transpose-sexps will still be valid code.
> I don't really think it is useful to do things like "def foo() -> foo
> def()" because that is nonsensical code, and is covered by
> transpose-words anyway.  To me a _more_ sensible approach here would be
> to transpose the defun at point with the next one, as they are usually
> interchangeable.  I am looking into such an improvement, and have been
> for a while.
>

That, in my opinion, is the wrong way to look at it.

`C-M-t' already works well: it transposes stuff around point. Nothing
more, nothing less.

If I write rubbish code as a human, no amount of machine intelligence
will (yet) undo that. Nor should a 'clever' mechanism that is only
clever by half. Trying to transpose things on, near or around
point is a useful addition if, and only if, it can do so in a manner
that is sensible, and predictable, to its operator.

You will very quickly run into umpteen problems generalizing this.
That's why I have shown restraint and limited Combobulate to things
that I feel are simple (but made it quite easy for someone to
customize, if they disagree!)

As a user, I may well want to put my code into an erroneous state,
temporarily, because I am doing something that cannot be represented
atomically as a single command. Therefore, `self-insert-command' (for
example) does not predict what I am about to type and intercedes when
it disagrees with me: it merely abides.

When I do this with `C-M-t' it is because it is an intentional act on
my behalf. The example I gave above is illustrative; it's designed to
highlight the problem.

>>
>>
>>
>>>> You could make a cogent argument that both approaches are wrong from a
>>>> syntactic perspective, but I think that misses the broader point that
>>>> `C-M-t' now does something errant and unexpected.
>>>
>>> I don't really see how "foo def():" is any better at all.  We gain some
>>> great improvements with this "naive" method - namely:
>>>
>>> if 5 + 5 == 10 then 10 else 100 + 100.  If point is on the else, the 100
>>> + 100 will be swapped with 10, but the old behavior will be broken.
>>>
>>
>> The old behaviour was consistent. It had a simple *modus operandi*:
>> swap two things around point. As someone who has used `C-M-t' for
>> decades, I know what it'll do in pretty much all situations, because I
>> know what `C-M-k` and `C-M-f/b` do at all times.
>
> It may be consistent, but imo it is too close to transpose-words, and
> too likely to create useless code in non-lisp languages.
>

No, transpose word and transpose sexp are very different; do different
things; and apply in vastly different circumstances:

Let -!- be point:

    d = {'Hello, World!': -!- 1}

    # C-M-t
    d = {1: 'Hello, World!'}

    # M-t
    d = {'Hello, 1!': World}

`transpose-sexps' works just fine the way it is: enriching it with a
greater understanding of certain contexts is a fruitful endeavour, if
it is done sympathetically.
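
To make the difference above concrete, here is a minimal Python sketch
of the two commands. It is only a toy model, not how Emacs implements
them: it assumes a "sexp" is just a quoted string or a run of word
characters (balanced brackets are left out), and the names
`transpose_around`, `SEXP` and `WORD` are made up for illustration.

```python
import re

# Toy "sexp": a quoted string or a run of word characters.  Real Emacs
# sexps also cover balanced brackets; that is deliberately left out here.
SEXP = re.compile(r"'[^']*'|\"[^\"]*\"|\w+")
WORD = re.compile(r"\w+")


def transpose_around(text, point, unit):
    """Swap the last UNIT ending at or before POINT with the first
    UNIT starting at or after POINT."""
    matches = list(unit.finditer(text))
    before = [m for m in matches if m.end() <= point]
    after = [m for m in matches if m.start() >= point]
    if not before or not after:
        raise ValueError("nothing to transpose around point")
    a, b = before[-1], after[0]
    return (text[:a.start()] + b.group() + text[a.end():b.start()]
            + a.group() + text[b.end():])


text = "d = {'Hello, World!': 1}"
point = text.index("1")  # -!- sits just before the 1

print(transpose_around(text, point, SEXP))  # sexp-style swap, like C-M-t
print(transpose_around(text, point, WORD))  # word-style swap, like M-t
```

With the same toy rules, `def foo():` with point just before `foo`
becomes `foo def():` under the sexp-style swap, which is the classic
behaviour described at the start of this report.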

>>
>> Neither approach is great if you holistically approach this task as
>> "making it correct at all times", and it is easy to confect scenarios
>> that result in something that is semantically wrong, but syntactically
>> correct; something that is plain wrong, both semantically and
>> syntactically; and something that is occasionally correct.
>>
>
> I see what you mean, but to me semantically _and_ syntactically correct
> is the benefit we should pursue when we actually have the parse tree.
> The current implementation will be semantically correct in many interesting
> cases, such as the one I outlined, and is a huge improvement to the
> current "transpose-words"-like behavior.
>

There is no such thing as "syntactically correct" if you allow a user
unfettered access to type in a buffer. Merely typing in the wrong
place will break that promise. And who are we to judge what someone
writes and where?

The resting state of all code is "almost always broken" as you're
typing out your code.

>> 'Like' siblings are an easy way out of this mess with the caveat, as
>> you'll see, but now you need to carefully pluck the right nodes from
>> the tree!
>>
>> Consider the node type `pair' in a dict in Python. They are easily
>> transposable for that very reason, notwithstanding the anonymous ","
>> node betwixt them.
>>
>> That is why Combobulate has a list of stuff that it can safely
>> transpose, and for everything else it defaults to the "classic"
>> transpose.
>>
>
> Yeah, such an approach seems reasonable, and there is already precedent
> in defining such "things" in Emacs.  As for the default fallback, I'll
> see what I can do in the "treesit-transpose-sexps" function.  The
> machinery in transpose-subr and friends is a little finicky, so to
> adhere to that mechanism isn't the easiest thing.
>

Sure. But please be careful when you make changes to `transpose-subr'
(or `transpose-subr-1') so that they don't break its existing contract
with its current users. It is a *very* powerful set of commands that
can swap arbitrary tracts of text.
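
For illustration, the "list of safe things, classic transpose for
everything else" idea discussed above could be sketched roughly like
this in Python. It is not Combobulate's code nor anything in Emacs: the
`Node` class is a toy stand-in for a parse-tree node, the `ALLOWED` set
is a hypothetical allowlist, and `classic_transpose` is a crude
approximation of the old behaviour.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Toy stand-in for a parse-tree node: a type and a text span."""
    type: str
    start: int
    end: int


# Hypothetical allowlist; a real one would be curated per language.
ALLOWED = {"pair"}


def swap(text, a_start, a_end, b_start, b_end):
    """Exchange text[a_start:a_end] and text[b_start:b_end] (a before b)."""
    return (text[:a_start] + text[b_start:b_end] + text[a_end:b_start]
            + text[a_start:a_end] + text[b_end:])


def classic_transpose(text, point):
    """Crude fallback: swap the string-or-word tokens around point."""
    tokens = list(re.finditer(r"'[^']*'|\w+", text))
    before = [m for m in tokens if m.end() <= point]
    after = [m for m in tokens if m.start() >= point]
    if not before or not after:
        return text
    a, b = before[-1], after[0]
    return swap(text, a.start(), a.end(), b.start(), b.end())


def transpose(text, siblings, point, allowed=ALLOWED):
    """Swap allowlisted siblings around point, else fall back."""
    # Filtering by type also skips anonymous separators such as ",".
    like = [n for n in siblings if n.type in allowed]
    before = [n for n in like if n.end <= point]
    after = [n for n in like if n.start >= point]
    if before and after and before[-1].type == after[0].type:
        a, b = before[-1], after[0]
        return swap(text, a.start, a.end, b.start, b.end)
    return classic_transpose(text, point)


def span(text, type_, snippet):
    """Build a Node covering the first occurrence of snippet in text."""
    start = text.index(snippet)
    return Node(type_, start, start + len(snippet))


d = "{'a': 1, 'b': 2}"
d_children = [span(d, "pair", "'a': 1"), span(d, ",", ","),
              span(d, "pair", "'b': 2")]
print(transpose(d, d_children, d.index(",") + 1))  # pairs are swapped

f = "def foo():"
f_children = [span(f, "def", "def"), span(f, "identifier", "foo"),
              span(f, "parameters", "()")]
print(transpose(f, f_children, f.index("foo")))  # falls back to classic
```

The point of the fallback is that nothing on the allowlist means
nothing changes for the user: the command still swaps the two things
around point, as it always has.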

>>>>
>>>> Worse, it's not possible to revert to the old behaviour (see
>>>> bug#60654)
>>>>
>>>>
>>
>> Thanks for fixing that!
>>
>
> No problem - hopefully it is installed pretty soon.
>
> Theo





