bug#60750: 29.0.60; encode-coding-char fails for utf-8-auto coding system
From: Robert Pluim
Subject: bug#60750: 29.0.60; encode-coding-char fails for utf-8-auto coding system
Date: Thu, 12 Jan 2023 14:44:29 +0100
>>>>> On Thu, 12 Jan 2023 14:32:52 +0200, Eli Zaretskii <eliz@gnu.org> said:
Eli> Actually, the doc string is clear:
Eli> If the value is a cons cell, on decoding, check the first two bytes.
Eli> If they are 0xFE 0xFF, use the car part coding system of the value.
Eli> If they are 0xFF 0xFE, use the cdr part coding system of the value.
Eli> Otherwise, treat them as bytes for a normal character. On encoding,
Eli> produce BOM bytes according to the value of ‘:endian’.
Eli> Note the last sentence: it should unconditionally produce the BOM on
Eli> encoding. Which is what we do in your scenario.
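For readers following along, the rule in the quoted doc string can be sketched roughly as below. This is only an illustration in Python, not the Emacs implementation: `pick_coding_system` and `bom_for` are hypothetical names, the pair argument stands in for the cons cell, and the sketch models nothing beyond the two-byte check and the ':endian' lookup as the doc string words them.

```python
import codecs

# Hypothetical helpers for illustration only; these are not Emacs functions.

def pick_coding_system(data: bytes, pair):
    """Decoding side: choose a coding system from the first two bytes.

    `pair` is a (car, cdr) tuple standing in for the cons-cell value.
    Returns the chosen name, or None when the first two bytes are not a
    BOM and should be treated as bytes of a normal character.
    """
    car, cdr = pair
    head = data[:2]
    if head == codecs.BOM_UTF16_BE:   # 0xFE 0xFF -> car part
        return car
    if head == codecs.BOM_UTF16_LE:   # 0xFF 0xFE -> cdr part
        return cdr
    return None


def bom_for(endian: str) -> bytes:
    """Encoding side: the BOM bytes selected by the value of ':endian'."""
    if endian == "big":
        return codecs.BOM_UTF16_BE
    if endian == "little":
        return codecs.BOM_UTF16_LE
    raise ValueError("endian must be 'big' or 'little'")
```

The point of the last sentence of the doc string is that the encoding side always emits a BOM; ':endian' only selects which one.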
Ah, I misread that as "depending on the value of ':endian'".
One minor nit: the description for ':endian' says:
`:endian'
VALUE must be `big' or `little' specifying big-endian and
little-endian respectively. The default value is `big'.
This attribute is meaningful only when `:coding-type' is `utf-16'.
That last sentence seems untrue, as ':endian' is also meaningful for
'utf-8-auto'.
>> (Iʼm willing to be told that buffer-file-coding-system shouldnʼt be
>> 'utf-8-auto, but I never set that explicitly as far as I know 😀)
Eli> Who does set utf-8-auto? where did you originally bump into this?
Eli> This is an obscure coding-system, and the fix to make it work as
Eli> documented will produce an incompatible change in behavior. So before
Eli> I decide whether to make the change and on what branch, I'd like to
Eli> know how in the world did you encounter this.
Itʼs entirely my own fault:
The file where I noticed this is shared between a GNU/Linux and a
macOS machine, which means I foolishly added the following a year ago,
even though itʼs unnecessary (perhaps I was thinking I was going to be
sharing it with a Windows machine?):
;; -*- lexical-binding: t; coding: utf-8-auto; -*-
I think that means we can leave the code as it is.
Robert