bug#61726: [PATCH] Eglot: Support positionEncoding capability

bug-gnu-emacs



From:	Eli Zaretskii
Subject:	bug#61726: [PATCH] Eglot: Support positionEncoding capability
Date:	Fri, 24 Feb 2023 10:38:35 +0200

> From: Augusto Stoffel <arstoffel@gmail.com>
> Cc: joaotavora@gmail.com,  61726@debbugs.gnu.org
> Date: Fri, 24 Feb 2023 08:18:30 +0100
> 
> On Fri, 24 Feb 2023 at 08:43, Eli Zaretskii wrote:
> 
> > It does? then please humor me by walking me through the code and the
> > patch to show how that would work after applying the patch.
> 
> +            :general
> +            (list
> +             :positionEncodings ["utf-32" "utf-8" "utf-16"])
>              :experimental eglot--{})))

Is "utf-32" the terminology LSP itself uses?  Because I'd prefer a
different name if we can.  At least for our internal nomenclature,
let's use "codepoint" or "character" instead.

> -(defun eglot-current-column () (- (point) (line-beginning-position)))
> +(defun eglot-current-column ()
> +  "Calculate current column, counting Unicode codepoints."
> +  (- (point) (line-beginning-position)))

Can we please take this opportunity to get rid of the confusing
"column" terminology?  As became evident from this discussion, we are
not talking columns here, we are talking offsets in characters from
BOL.  So something like "pos" or "linepos" or "line-offset" should be
better.

João, are you okay with such a sweeping change in all of eglot.el?

> +(defun eglot--current-column-utf-8 ()
> +  "Calculate current column, counting bytes."
> +  (- (position-bytes (point)) (position-bytes (line-beginning-position))))

As discussed, position-bytes is incorrect.  You should instead do
something like

  (length (encode-coding-string
           (buffer-substring-no-properties (point)
                                           (line-beginning-position))
           'utf-8-unix t))

Also, for 100% reliable results, we should bind
inhibit-field-text-motion to t when calling line-beginning-position.
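
Putting those two suggestions together, a minimal sketch could look
like the one below.  The name my-line-offset-utf-8 is hypothetical
and used only for illustration; it is not part of the patch or of
eglot.el.

```elisp
(defun my-line-offset-utf-8 ()
  "Return the offset of point from BOL, counted in UTF-8 bytes.
Hypothetical helper for illustration; not part of eglot.el."
  ;; Bind this so that `line-beginning-position' ignores field
  ;; boundaries and always returns the real beginning of the line.
  (let ((inhibit-field-text-motion t))
    ;; Encoding yields a unibyte string, so `length' counts bytes.
    (length (encode-coding-string
             (buffer-substring-no-properties (line-beginning-position)
                                             (point))
             'utf-8-unix t))))
```

For example, with point at the end of a line containing "aé€b",
this should count 1 + 2 + 3 + 1 bytes.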

> +(defun eglot--move-to-column-utf-8 (column)
> +  "Move to COLUMN, regarded as a byte offset."
> +  (goto-char (min (byte-to-position
> +                   (+ (position-bytes (line-beginning-position)) column))
> +                  (line-end-position))))

Likewise here.
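
One possible shape for the reverse direction, as a sketch only: walk
forward from BOL one character at a time and add up each character's
encoded length.  The name my-move-to-line-offset-utf-8 is
hypothetical, and stopping after a character whose bytes straddle
the requested offset is my assumption about the desired behavior.

```elisp
(defun my-move-to-line-offset-utf-8 (offset)
  "Move point to OFFSET UTF-8 bytes past BOL, stopping at end of line.
Return the new value of point.
Hypothetical helper for illustration; not part of eglot.el."
  (let ((inhibit-field-text-motion t))
    (goto-char (line-beginning-position))
    (let ((eol (line-end-position))
          (bytes 0))
      ;; Advance one character at a time, adding its UTF-8 length.
      ;; If OFFSET falls inside a multibyte character, point ends up
      ;; after that character (an assumption of this sketch).
      (while (and (< (point) eol) (< bytes offset))
        (setq bytes (+ bytes
                       (length (encode-coding-string
                                (char-to-string (char-after))
                                'utf-8-unix t))))
        (forward-char 1)))
    (point)))
```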

> @@ -1515,14 +1536,20 @@ eglot--lsp-position-to-point
>        (forward-line (min most-positive-fixnum
>                           (plist-get pos-plist :line)))
>        (unless (eobp) ;; if line was excessive leave point at eob
> -        (let ((tab-width 1)
> +        (let ((movefn (or eglot-move-to-column-function
> +                          (pcase (plist-get (eglot--capabilities (eglot-current-server))
> +                                            :positionEncoding)
> +                            ("utf-32" #'eglot-move-to-column)
> +                            ("utf-8" #'eglot--move-to-column-utf-8)
> +                            (_ #'eglot-move-to-lsp-abiding-column))))
> +              (tab-width 1)
                  ^^^^^^^^^^^
This last part shouldn't be necessary: we should move by characters,
not by columns.  Why is it necessary?
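
To illustrate what I mean by moving by characters, here is a sketch
under the assumption that the offset counts codepoints from BOL.
The name my-move-to-line-offset is hypothetical.  Since it works
with buffer positions only, the value of tab-width cannot affect
the result.

```elisp
(defun my-move-to-line-offset (offset)
  "Move point OFFSET characters past BOL, stopping at end of line.
Return the new value of point.
Hypothetical helper for illustration; not part of eglot.el."
  (let ((inhibit-field-text-motion t))
    ;; Buffer positions count characters, so a TAB is one step
    ;; regardless of how many columns it occupies on display.
    (goto-char (min (+ (line-beginning-position) offset)
                    (line-end-position)))))
```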

> I hope this helps clarifying things.

Yes, thank you very much.
