bug-gnu-emacs

bug#60750: 29.0.60; encode-coding-char fails for utf-8-auto coding system


From: Robert Pluim
Subject: bug#60750: 29.0.60; encode-coding-char fails for utf-8-auto coding system
Date: Thu, 12 Jan 2023 14:44:29 +0100

>>>>> On Thu, 12 Jan 2023 14:32:52 +0200, Eli Zaretskii <eliz@gnu.org> said:

    Eli> Actually, the doc string is clear:

    Eli>   If the value is a cons cell, on decoding, check the first two bytes.
    Eli>   If they are 0xFE 0xFF, use the car part coding system of the value.
    Eli>   If they are 0xFF 0xFE, use the cdr part coding system of the value.
    Eli>   Otherwise, treat them as bytes for a normal character.  On encoding,
    Eli>   produce BOM bytes according to the value of ‘:endian’.

    Eli> Note the last sentence: it should unconditionally produce the BOM on
    Eli> encoding.  Which is what we do in your scenario.
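The rule in the quoted doc string can be sketched outside Emacs. The following is a minimal Python illustration, not Emacs's implementation: the function names `pick_decoder` and `encode_with_bom` are hypothetical, the pair of codec names stands in for the cons cell, and falling back to big-endian when no BOM is present is an assumption based on the documented default for `:endian`.

```python
import codecs

def pick_decoder(data, bom_pair=("utf-16-be", "utf-16-le"), default="utf-16-be"):
    """Choose a codec from the first two bytes, as the doc string describes.

    bom_pair plays the role of the cons cell: (car, cdr).
    default is an assumption for the "no BOM" case.
    Returns (codec_name, payload_without_bom).
    """
    head = data[:2]
    if head == codecs.BOM_UTF16_BE:      # 0xFE 0xFF -> car
        return bom_pair[0], data[2:]
    if head == codecs.BOM_UTF16_LE:      # 0xFF 0xFE -> cdr
        return bom_pair[1], data[2:]
    return default, data                 # ordinary character bytes, nothing stripped

def encode_with_bom(text, endian="big"):
    """Always emit a BOM; endian only selects which one."""
    if endian == "big":
        return codecs.BOM_UTF16_BE + text.encode("utf-16-be")
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")

if __name__ == "__main__":
    encoded = encode_with_bom("a", endian="little")
    name, payload = pick_decoder(encoded)
    print(name, payload.decode(name))
```

The point of the sketch is the asymmetry: decoding inspects the input to decide, while encoding writes the BOM unconditionally and consults the endianness only to pick the byte order.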

Ah, I misread that as "depending on the value of ':endian'".

One minor nit: the description for ':endian' says:

    `:endian'

    VALUE must be `big' or `little' specifying big-endian and
    little-endian respectively.  The default value is `big'.

    This attribute is meaningful only when `:coding-type' is `utf-16'.

That last sentence seems untrue, as ':endian' is also meaningful for
'utf-8-auto'.

    >> (Iʼm willing to be told that buffer-file-coding-system shouldnʼt be
    >> 'utf-8-auto, but I never set that explicitly as far as I know 😀)

    Eli> Who does set utf-8-auto? where did you originally bump into this?
    Eli> This is an obscure coding-system, and the fix to make it work as
    Eli> documented will produce an incompatible change in behavior.  So before
    Eli> I decide whether to make the change and on what branch, I'd like to
    Eli> know how in the world did you encounter this.

Itʼs entirely my own fault:

The file where I noticed this is shared between a GNU/Linux and a
macOS machine, which is why I foolishly added the following a year ago,
even though itʼs unnecessary (perhaps I thought I was going to be
sharing it with a Windows machine?):

    ;; -*- lexical-binding: t; coding: utf-8-auto; -*-

I think that means we can leave the code as it is.

Robert
-- 




