bug-gnu-emacs

bug#20623: XML and HTML files with encoding/charset="utf-8" declaration


From: Eli Zaretskii
Subject: bug#20623: XML and HTML files with encoding/charset="utf-8" declaration lose BOM; Coding system is reset from utf-8-with-signature to utf-8 on save
Date: Thu, 21 May 2015 22:48:31 +0300

> Date: Thu, 21 May 2015 20:50:58 +0200
> From: Simon Ledergerber <sledergerber@gmx.net>
> 
> When I was editing XHTML and HTML files, I wanted to make sure the BOM 
> was written out to the file in order to make it easier for the browser 
> to detect the UTF-8 encoding. Therefore I changed the coding system for 
> the file buffer to utf-8-with-signature-dos (since I am working on a 
> Windows system) before saving the file.
> 
> After some time I was surprised to find that the browser (IE11) did not
> report UTF-8 as the file's encoding. Having checked a hexdump of my
> (X)HTML file, I saw that the BOM was definitely missing.
> 
> Apparently, when the string "utf-8" appears in a <meta charset="utf-8">
> declaration (even a commented-out one, see below) or in an
> <?xml version="1.0" encoding="utf-8"?> declaration, Emacs switches the
> file's coding system to utf-8 when it saves the file, even if
> utf-8-with-signature was explicitly specified beforehand. This looks
> like a bug to me, because there is then no way to restore the BOM
> using Emacs.

What would you expect Emacs to do instead?  It just obeys the stated
encoding, which says nothing about the BOM.  How can Emacs know when
to use utf-8 and when utf-8-with-signature?
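
To make the ambiguity concrete, here is a minimal sketch in Python (not
Emacs Lisp). It assumes Python's "utf-8-sig" codec as a stand-in for
Emacs's utf-8-with-signature, and the sample document and the has_bom
helper are hypothetical. The same text, with the same encoding="utf-8"
declaration, can be written with or without the 3-byte BOM (EF BB BF),
so the declaration alone does not say which variant is wanted.

```python
import codecs

# Hypothetical sample document; the declaration only says "utf-8".
text = '<?xml version="1.0" encoding="utf-8"?>\n<root/>\n'

# Same text, two on-disk forms.
plain = text.encode("utf-8")        # no signature
signed = text.encode("utf-8-sig")   # BOM (EF BB BF) prepended


def has_bom(data: bytes) -> bool:
    """Return True if data starts with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


if __name__ == "__main__":
    print("plain has BOM: ", has_bom(plain))
    print("signed has BOM:", has_bom(signed))
    # Apart from the 3-byte signature, the two byte streams are identical,
    # including the encoding declaration itself.
    print("identical after BOM:", signed[len(codecs.BOM_UTF8):] == plain)
```

Both byte streams decode to the same text, which is why a rule keyed
only on the declared encoding cannot tell the two cases apart.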




