Charset hiccups with HTML mails display

From: Tim Landscheidt
Subject: Charset hiccups with HTML mails display
Date: Fri, 11 Jun 2010 21:35:12 +0000
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1 (gnu/linux)


For quite some time I have had the problem that HTML mails
are not displayed properly due to charset hiccups (Gnus
v5.13/GNU Emacs 23.1.1 here), and I need to tackle it now.

  The issue is easily explained: A message consisting of
(only relevant header lines):

| MIME-Version: 1.0
| Content-Transfer-Encoding: 8bit
| Content-type: text/html; charset="iso-8859-1"

| <html>
|   <body>
|     <p>Test: &Auml;</p>
|   </body>
| </html>

should display "Test: Ä" in some way, but instead Gnus shows
"Test: \303\204" (i.e. "Ä" encoded in UTF-8).

  mm-text-html-renderer is set to lynx, and
mm-text-html-renderer-alist's lynx entry is (lynx
mm-inline-render-with-stdin nil "lynx" "-dump" "-force_html"
"-stdin" "-nolist"). If I prepend
"-display_charset=iso-8859-1" to that argument list, "\304"
is rendered instead of "\303\204" (i.e. "Ä" encoded in
ISO 8859-1).
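
  For reference, the two escape sequences can be reproduced
outside Emacs. The following is only a minimal Python sketch;
octal_escapes is a hypothetical helper written for this
illustration and is not part of Gnus or Lynx:

```python
def octal_escapes(data: bytes) -> str:
    """Render each byte as a backslash-octal escape, similar to how
    Emacs displays undecoded raw bytes."""
    return "".join(f"\\{b:03o}" for b in data)


char = "\u00c4"  # LATIN CAPITAL LETTER A WITH DIAERESIS ("Ä")

utf8 = char.encode("utf-8")         # two bytes: 0xC3 0x84
latin1 = char.encode("iso-8859-1")  # one byte:  0xC4

print(octal_escapes(utf8))    # \303\204
print(octal_escapes(latin1))  # \304

# Decoding with the matching charset recovers the character in both cases.
assert utf8.decode("utf-8") == char
assert latin1.decode("iso-8859-1") == char
```

  In both cases the bytes themselves are valid for their
charset; what is shown depends only on whether they get
decoded with that charset or are left as raw bytes.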

  So the problem seems to be that Gnus does not decode
Lynx's output as UTF-8 but treats it as raw bytes. I use
shell-command and shell-command-on-region on a daily basis
without such problems, so the source of the problem does not
seem to lie with Emacs itself in this case.

  Any ideas?


