bug#20623: XML and HTML files with encoding/charset="utf-8" declaration

From: Eli Zaretskii
Subject: bug#20623: XML and HTML files with encoding/charset="utf-8" declaration lose BOM; Coding system is reset from utf-8-with-signature to utf-8 on save
Date: Sat, 11 Aug 2018 19:27:33 +0300

> Date: Sat, 11 Aug 2018 17:41:01 +0200
> From: Vincent Lefevre <address@hidden>
> Cc: address@hidden, address@hidden, address@hidden,
>       address@hidden, address@hidden
> > > You're completely wrong. The presence of BOM or not is very important
> > > for some applications, such as Firefox (not to determine the charset,
> > > but the MIME type of local files).
> > 
> > Please provide the details, including the use case, if possible.  I'm
> > still in the dark regarding the importance of the BOM in UTF-8 encoded
> > HTML stuff.
>   https://bugzilla.mozilla.org/show_bug.cgi?id=1422889
> for HTML. Wontfix because of:
>   https://mimesniff.spec.whatwg.org/#mime-type-sniffing-algorithm
> For text/plain only (but this is another example that BOM can matter
> in practice), there's
>   https://bugzilla.mozilla.org/show_bug.cgi?id=1071816
> (which is a bug that should be fixed).

Maybe I'm missing something, but none of these issues describes the
situation in this bug report, namely: an HTML file with an explicit
charset= tag, with or without a BOM.  In fact, the first of these
issues happens only in files that _do_ have a BOM, so you could say
that Emacs did you a favor by removing it ;-)

> > I agree about the user not knowing, but that doesn't yet qualify as
> > "data loss", which has a widely accepted meaning.
> This is data corruption, which is a form of data loss, because some
> information is lost in the process (as a reminder, Emacs does not
> provide any information to the user about this transformation).

That is the most inclusive interpretation of "data loss" I've ever
seen.  "Some information is lost" is nowhere near what "grave bug"
means by "data loss", so I don't think "grave" applies here.

Anyway, the Emacs issue is now fixed.
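
For anyone who wants to confirm that a saved file still carries the
UTF-8 signature, checking the first three bytes (EF BB BF) is enough.
A minimal sketch in Python; the helper name has_utf8_bom is only an
illustrative assumption, not something from Emacs or from this report:

```python
import codecs


def has_utf8_bom(path):
    """Return True if the file at `path` starts with the UTF-8 BOM (EF BB BF)."""
    # Read raw bytes so no decoding step can silently strip the signature.
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8
```

Running it on a file before and after saving should show whether the
signature survived the round trip.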
