From: Lars Ingebrigtsen
Subject: bug#40794: 26.3; HTML entities ☆ and ★ (inter alia) are not parsed by libxml-parse-html-region
Date: Wed, 29 Jul 2020 07:26:15 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

Tim Landscheidt <tim@tim-landscheidt.de> writes:
> (Prologue: This bug showed up in the "ALT" attribute of an
> "IMG" element of an HTML mail in Gnus. I am reasonably cer-
> tain that this stems from libxml-parse-html-region and
> should be fixed there, but there may be more prudent solu-
> tions.)
[...]
> These should instead yield "ä" (228), "☆" (9734) and
> "★" (9733).
>
> lisp/leim/quail/sgml-input.el seems to contain the necessary
> data for ☆ and ★ that could probably be fed to
> libxml.

As far as I can tell, libxml2 doesn't take a list of entities as an
input when parsing HTML? I may have missed something...

Hm, a bit of googling shows http://xmlsoft.org/html/libxml-entities.html
and there is apparently a way to tell libxml2 about further entities?

But I think this all sounds more like a libxml2 than an Emacs bug,
really?
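
For reference, the expected code points quoted above can be
cross-checked against an independent HTML5 entity table. The sketch
below uses Python's standard `html.unescape` rather than libxml2, so it
says nothing about what libxml2 itself does. The entity names `&auml;`,
`&star;` and `&starf;` are an assumption: the archived message shows
only the rendered characters, not the names originally typed.

```python
import html

# Assumed entity names; the archive shows only the rendered glyphs,
# so these are a guess at what the original report contained.
ENTITIES = ["&auml;", "&star;", "&starf;"]


def code_points(entities):
    """Map each entity reference to the code point it decodes to.

    Uses Python's HTML5 named-character table, not libxml2's.
    """
    return {entity: ord(html.unescape(entity)) for entity in entities}


if __name__ == "__main__":
    for entity, code_point in code_points(ENTITIES).items():
        print(f"{entity} -> U+{code_point:04X} ({code_point})")
```

If the assumed names are right, the decimal values printed should match
the ones in the report, which would suggest the gap is in the entity
table the parser consults and not in the expected values.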
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no