emacs-devel

Re: [PATCH] * etc/NEWS: Announce addition of BOM to utf-8-auto


From: Tom Gillespie
Subject: Re: [PATCH] * etc/NEWS: Announce addition of BOM to utf-8-auto
Date: Sun, 29 Jan 2023 14:56:11 -0500

>  Encoding with 'utf-8-auto' now correctly produces a byte order mark.

Much better.

> Maybe (you assume that people really read all the small print in
> NEWS?).  But first, could you explain why on earth are you using
> utf-8-auto _on_encoding_?  It basically makes no sense at all.

Hah, no, I don't think many people do, but maybe the maintainers
of some of the more widely used packages might?

I have no idea why they are using it for encoding. Having played
with it, I found that it produces absolutely insane results, such as
repeated calls prepending multiple BOMs when the default coding
system is not itself set to utf-8-auto (or something like that).
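
One way to check what a coding system emits on encoding is to look at
the leading bytes of the result. The following is a minimal sketch, not
taken from async.el or from the patch; the helper name
`my/leading-bytes` is hypothetical. No expected value is given for
utf-8-auto because the result depends on whether the Emacs in use
includes the change this NEWS entry announces.

```elisp
;; Hypothetical helper for illustration only; not part of Emacs or async.el.
(defun my/leading-bytes (string coding-system &optional n)
  "Return the first N (default 3) bytes of STRING encoded with CODING-SYSTEM."
  (let ((bytes (encode-coding-string string coding-system)))
    ;; `append' with a nil tail converts the unibyte string to a list of bytes.
    (append (substring bytes 0 (min (or n 3) (length bytes))) nil)))

(my/leading-bytes "a" 'utf-8-with-signature) ; => (239 187 191), the UTF-8 BOM
(my/leading-bytes "a" 'utf-8-unix)           ; => (97), no BOM
(my/leading-bytes "a" 'utf-8-auto)           ; depends on the Emacs version
```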

Maybe this is an opportunity to add a line to the message that says
"As a reminder, there are next to no cases where utf-8-auto
should be used with 'encode-coding-' functions." or similar?

> All the people who did that with whom I talked until now did it
> because they thought the "auto" part was about the EOL format (CR-LF
> vs Newline).  Is that so in your case as well?

I personally have never touched utf-8-auto, but I'm cleaning
up existing bugs that have impacted me.

If I had to guess, this issue is probably the result of people
copying what is done in async.el, where there is a comment
that reads:

  ;; FIXME: Why use `utf-8-auto' instead of `utf-8-unix'?  This is
  ;; a communication channel over which we have complete control,
  ;; so we get to choose exactly which encoding and EOL we use, isn't it?

https://github.com/jwiegley/emacs-async/blob/270c3d0bd99386dd9a8538990401993a6a3cb1bc/async.el#L201-L203

Which suggests that your account of the confusion is exactly the issue.
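
To illustrate the distinction, the EOL convention on encoding is
selected by the coding system's suffix (-unix, -dos), not by "auto".
This is a minimal sketch; the helper name `my/encoded-bytes` is
hypothetical and does not come from async.el.

```elisp
;; Hypothetical helper for illustration only; not part of Emacs or async.el.
(defun my/encoded-bytes (string coding-system)
  "Return STRING encoded with CODING-SYSTEM as a list of byte values."
  (append (encode-coding-string string coding-system) nil))

(my/encoded-bytes "a\nb" 'utf-8-unix) ; => (97 10 98), newline stays LF
(my/encoded-bytes "a\nb" 'utf-8-dos)  ; => (97 13 10 98), newline becomes CR LF
```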

However, there is also a comment suggesting that it somehow mitigates
issues with strings that contain EOFs. Is this even true?

  ;; Just in case the string we're sending might contain EOF
  (encode-coding-region (point-min) (point-max) 'utf-8-auto)
https://github.com/jwiegley/emacs-async/blob/270c3d0bd99386dd9a8538990401993a6a3cb1bc/async.el#L222-L223


