emacs-bug-tracker

bug#61005: closed (28.1.91; Encoding not detected in HTML files inside a


From: GNU bug Tracking System
Subject: bug#61005: closed (28.1.91; Encoding not detected in HTML files inside archives)
Date: Sun, 22 Jan 2023 14:11:01 +0000

Your message dated Sun, 22 Jan 2023 16:09:47 +0200
with message-id <83leluk6dw.fsf@gnu.org>
and subject line Re: bug#61005: 28.1.91; Encoding not detected in HTML files 
inside archives
has caused the debbugs.gnu.org bug report #61005,
regarding 28.1.91; Encoding not detected in HTML files inside archives
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs@gnu.org.)


-- 
61005: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=61005
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems
--- Begin Message ---
Subject: 28.1.91; Encoding not detected in HTML files inside archives
Date: Sun, 22 Jan 2023 14:13:50 +0100

Problem
----

* Given an HTML file with charset "windows-1255". 

* Opening the file from disk detects the encoding correctly.

* Opening a ZIP archive with the same file inside and then opening the
  HTML archive member does not detect the encoding.  Instead, M-x
  describe-coding-system reports that the coding system for saving is
  the default.

Attached are two files test.html and test.zip.  Call "emacs -Q test.html
test.zip" and press RET on the archive member to reproduce.
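
The attachments are not reproduced in this archive, so here is a rough
sketch of how comparable files could be regenerated.  It is not the
original attachment: the HTML markup, the meta tag form, the unpointed
Hebrew word, and the helper name write_repro_files are all assumptions
made for illustration.  Only the file names and the "windows-1255"
charset come from the report.

```python
import zipfile
from pathlib import Path

# Assumed page content: a minimal document that declares its charset in
# a meta tag, with a short Hebrew word in the body.
HTML = (
    "<html>\n<head>\n"
    '<meta http-equiv="Content-Type" '
    'content="text/html; charset=windows-1255">\n'
    "</head>\n<body>\n"
    "<p>\u05e9\u05dc\u05d5\u05dd</p>\n"
    "</body>\n</html>\n"
)


def write_repro_files(directory="."):
    """Write test.html and test.zip (hypothetical stand-ins) to DIRECTORY."""
    directory = Path(directory)
    # Encode once so the file on disk and the archive member are
    # byte-for-byte identical.
    data = HTML.encode("windows-1255")
    html_path = directory / "test.html"
    zip_path = directory / "test.zip"
    html_path.write_bytes(data)
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("test.html", data)
    return html_path, zip_path
```

Calling write_repro_files() in an empty directory and then running
"emacs -Q test.html test.zip" there should allow comparing the two
cases, assuming the generated markup triggers the same code path as the
original attachment.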

שָׁלוֹם

Attachment: test.zip
Description: Zip archive

Solution
----

The problem seems to be in the function
sgml-html-meta-auto-coding-function.  It is missing a condition similar
to the one added to sgml-xml-auto-coding-function with commit
#df7ed10e in 2018.

modified   lisp/international/mule.el
@@ -2539,6 +2539,10 @@ sgml-html-meta-auto-coding-function
                   (bfcs-type
                    (coding-system-type buffer-file-coding-system)))
               (if (and enable-multibyte-characters
+                       ;; 'charset' will signal an error in
+                       ;; coding-system-equal, since it isn't a
+                       ;; coding-system.  So test that up front.
+                       (not (equal sym-type 'charset))
                        (coding-system-equal 'utf-8 sym-type)
                        (coding-system-equal 'utf-8 bfcs-type))
                   buffer-file-coding-system

I will send this as a patch as soon as I have a bug number to mention in
the commit message.

----

In GNU Emacs 28.1.91 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.24, cairo version 1.16.0)
 of 2022-08-29 built on arrian
Repository revision: f4168b8143008b787a11366462c928d761e90dd0
Repository branch: emacs-28
Windowing system distributor 'The X.Org Foundation', version 11.0.12011000
System Description: Debian GNU/Linux 11 (bullseye)

Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ JPEG
JSON LCMS2 LIBOTF LIBSELINUX LIBXML2 M17N_FLT MODULES NOTIFY INOTIFY
PDUMPER PNG RSVG SECCOMP SOUND THREADS TIFF TOOLKIT_SCROLL_BARS X11 XDBE
XIM XPM GTK3 ZLIB

Important settings:
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Dired by date

Minor modes in effect:
  shell-dirtrack-mode: t
  desktop-save-mode: t
  display-time-mode: t
  xclip-mode: t
  xterm-mouse-mode: t
  delete-selection-mode: t
  cua-mode: t
  display-battery-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  show-paren-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  buffer-read-only: t
  column-number-mode: t
  line-number-mode: t
  transient-mark-mode: t

Load-path shadows:
~/Projects/ttf-mode/arc-mode-compat hides ~/emacs/arc-mode-compat
/home/benny/.emacs.d/elpa/transient-20210723.1601/transient hides /usr/local/share/emacs/28.1.91/lisp/transient
/home/benny/.emacs.d/elpa/dictionary-20201001.1727/dictionary hides /usr/local/share/emacs/28.1.91/lisp/net/dictionary

Features:
(shadow sort mail-extr emacsbug message rmc puny rfc822 mml mml-sec epa
epg rfc6068 epg-config gnus-util rmail rmail-loaddefs mm-decode
mm-bodies mm-encode mailabbrev gmm-utils mailheader arc-mode
archive-mode benny-images dirtrack shell pcomplete misearch
multi-isearch thai-util thai-word lao-util enriched view tabify
benny-auto-insert ttf-glyphs rng-xsd xsd-regexp rng-cmpct rng-nxml
rng-valid rng-loc rng-uri rng-parse nxml-parse rng-match rng-dt rng-util
rng-pttrn nxml-ns nxml-mode nxml-outln nxml-rap sgml-mode facemenu dom
nxml-util nxml-enc xmltok mule-util jka-compr dired-aux time-date
bug-reference imenu desktop frameset highline benny-calendar-cfg
ange-ftp generic-x autoinsert cc-mode cc-fonts cc-guess cc-menus
cc-styles cc-align cc-cmds cc-engine cc-vars cc-defs ps-print
ps-print-loaddefs ps-def lpr advice cl-extra help-mode dired
dired-loaddefs derived benny-x-clipboard disp-table time server protbuf
xclip term/xterm xterm xt-mouse cal-china lunar solar cal-dst cal-bahai
cal-islam cal-hebrew holidays hol-loaddefs vc-git diff-mode easy-mmode
vc-dispatcher vc-fossil diary-lib diary-loaddefs cal-menu calendar
cal-loaddefs delsel grep compile text-property-search comint ansi-color
ring cua-base cus-load format-spec battery dbus xml sendmail mail-utils
.loaddefs benny-tools autoload radix-tree lisp-mnt mail-parse rfc2231
rfc2047 rfc2045 mm-util ietf-drums mail-prsvr edmacro kmacro info
package browse-url url url-proxy url-privacy url-expand url-methods
url-history url-cookie url-domsuf url-util mailcap url-handlers
url-parse auth-source cl-seq eieio eieio-core cl-macs eieio-loaddefs
password-cache json subr-x map url-vars seq byte-opt gv bytecomp
byte-compile cconv cl-loaddefs cl-lib iso-transl tooltip eldoc paren
electric uniquify ediff-hook vc-hooks lisp-float-type elisp-mode mwheel
term/x-win x-win term/common-win x-dnd tool-bar dnd fontset image
regexp-opt fringe tabulated-list replace newcomment text-mode lisp-mode
prog-mode register page tab-bar menu-bar rfn-eshadow isearch easymenu
timer select scroll-bar mouse jit-lock font-lock syntax font-core
term/tty-colors frame minibuffer cl-generic cham georgian utf-8-lang
misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms
cp51932 hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese composite emoji-zwj charscript charprop case-table
epa-hook jka-cmpr-hook help simple abbrev obarray cl-preloaded nadvice
button loaddefs faces cus-face macroexp files window text-properties
overlay sha1 md5 base64 format env code-pages mule custom widget
hashtable-print-readable backquote threads dbusbind inotify lcms2
dynamic-setting system-font-setting font-render-setting cairo
move-toolbar gtk x-toolkit x multi-tty make-network-process emacs)

Memory information:
((conses 16 273770 13520)
 (symbols 48 18619 1)
 (strings 32 66582 2920)
 (string-bytes 1 2318045)
 (vectors 16 39996)
 (vector-slots 8 1131973 174560)
 (floats 8 762 66)
 (intervals 56 1039 60)
 (buffers 992 50))

--- End Message ---
--- Begin Message ---
Subject: Re: bug#61005: 28.1.91; Encoding not detected in HTML files inside archives
Date: Sun, 22 Jan 2023 16:09:47 +0200

> From: Benjamin Riefenstahl <b.riefenstahl@turtle-trading.net>
> Date: Sun, 22 Jan 2023 14:24:07 +0100
> 
> The promised patch.  This is against master.
> 
> Also a small test-suite for sgml-html-meta-auto-coding-function, if you
> want that.  If you care, I could also add one for
> sgml-xml-auto-coding-function.

Thanks, I installed this on the emacs-29 branch, and I'm closing the
bug.


--- End Message ---
