[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#33567: Syntactic fontification of diff hunks

From: Eli Zaretskii
Subject: bug#33567: Syntactic fontification of diff hunks
Date: Tue, 11 Dec 2018 08:23:37 +0200

> From: Juri Linkov <address@hidden>
> Cc: address@hidden
> Date: Tue, 11 Dec 2018 02:38:18 +0200
> > If you still want to reuse the literal contents of the file, as
> > inserted by vc-git-find-revision etc., then you will have to duplicate
> > what insert-file-contents does internally.  I suggest to look at how
> > this is done in archive-set-buffer-as-visiting-file.
> I looked at archive-set-buffer-as-visiting-file, and it seems it could
> be simplified based on the assumption that the backend inserts output
> in the binary coding.

"Binary coding" means what we have in the buffer is exactly what we
had on disk.  IOW, we have there the contents of the file in its
original encoding, and so decoding it correctly needs to use the same
facilities we normally use when visiting a file with the likes of
find-file.  archive-set-buffer-as-visiting-file solves precisely the
same problem.

> I tried and this fixes the problem:
> diff --git a/lisp/vc/vc.el b/lisp/vc/vc.el
> index 5ff9f4d5be..c980369fa9 100644
> --- a/lisp/vc/vc.el
> +++ b/lisp/vc/vc.el
> @@ -2042,6 +2042,7 @@ vc-find-revision-no-save
>                     (if backend
>                         (vc-call-backend backend 'find-revision file revision 
> outbuf)
>                       (vc-call find-revision file revision outbuf))))
> +                  (recode-region (point-min) (point-max) 
> buffer-file-coding-system 'binary)

Where does the value of buffer-file-coding-system come from here?
Isn't that just (default-value 'buffer-file-coding-system)?  If so,
you were just lucky that it worked for you; in general, if the
encoding of the file is different from your locale-derived defaults,
the above won't DTRT.

In any case, really don't think recode-region is TRT in this case, for
several reasons:

 . recode-region is a command, so it wastes cycles checking the
   argument coding-system, which is entirely unnecessary in this case
 . it narrows the buffer, something that again is a waste of cycles
   for your case
 . it wastes even more cycles for "encoding" with 'binary', which in
   this case is a no-op, since the text was not decoded on reading
 . it doesn't use the following facilities for determining the right
   encoding, where you use buffer-file-coding-system:
    - auto-coding-function, which is where we detect the 'coding:'
      cookies in the first line and in the local vars, and use the
      data in auto-coding-alist and auto-coding-regexp-alist, and also
      call auto-coding-functions if needed
    - find-operation-coding-system by file name, which uses the data
      in file-coding-system-alist to determine the appropriate
      encoding given the file's name

The hard problem here is to determine what coding-system to use for
decoding a region that was inserted without any conversions; once the
encoding is determined, the rest boils down to calling
decode-coding-region with that encoding.  The method used by
archive-set-buffer-as-visiting-file solves that very problem, whereas
recode-region does not, because it is a command that relies on the
caller to specify the encoding, and is otherwise nothing more than a
thin wrapper around decode-coding-region.

If you need further help understanding what
archive-set-buffer-as-visiting-file does, and which parts might not be
relevant to the function you are writing, please ask specific
questions and I will gladly help as much as I can.  But recode-region
is IMO not the right tool for this job.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]