Re: utf8 char display in buffer

From: B. T. Raven
Subject: Re: utf8 char display in buffer
Date: Fri, 12 Jun 2009 11:48:55 -0500
User-agent: Thunderbird (Windows/20090302)

Lewis Perin wrote:
ken <address@hidden> writes:


Thanks for posting.  It's lonely out there when you're the only one with
a particular problem.

The few, the proud...

To make sure we're suffering the same cyber-indignity, here's the
scenario as I see it (from an older version of emacs running on

0) Some others and I want to include some non-English characters in
a file being edited in emacs. Problems arise, however:

1) In a buffer which is already utf-8 encoded, I set the appropriate
input method, type in the desired characters. They display just peachy
and there is happiness in EmacsLand.

2) I save the buffer to a file, then close the buffer.

3) I visit the same file (i.e., load it again into emacs). Because it
has <!-- -*- coding: utf-8; -*- --> as the first line, it opens
utf-8 encoded. This is confirmed by the presence of a 'u' as the second
character in the status bar.

I haven't been inserting that special first line.

4) The text in the buffer displays fine, except that in place of each of
those non-English characters is a little empty box. With the cursor on
one of those boxes (which should be an 'a' with a horizontal bar above
it), "C-x =" returns "Char: ā (01210041, 331809, 0x51021, file ...)".
(While, in emacs the character after "Char:" is a little box, if I load
this same file into Firefox, that same character appears as it should,
as an 'a' with a horizontal bar above it. How it appears in your email
client will depend upon your email client.)
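
For anyone trying to read that "C-x =" output: the three numbers look like one value written in octal, decimal, and hexadecimal, and that value is not the Unicode code point of 'ā'. The Python sketch below checks this. It assumes the intended character is LATIN SMALL LETTER A WITH MACRON (U+0101). Reading 331809 as an internal character code of the older emacs, rather than as a Unicode code point, is a guess and is not stated anywhere in this thread.

```python
import unicodedata

# The three numbers reported by "C-x =" in the message above.
octal_text, decimal_text, hex_text = "01210041", "331809", "0x51021"

# Parse each in its own base; a set collapses equal values into one.
reported = {int(octal_text, 8), int(decimal_text, 10), int(hex_text, 16)}

# Assumption: the character the poster typed is 'a' with a macron.
expected = "\u0101"


def describe(ch):
    """Return the Unicode name, code point, and UTF-8 bytes (hex) of ch."""
    return {
        "name": unicodedata.name(ch),
        "code_point": ord(ch),
        "utf8": ch.encode("utf-8").hex(),
    }


info = describe(expected)
print("values reported by C-x =:", reported)
print("expected character:", info)
print("same number?", info["code_point"] in reported)
```

If the last line prints False, the number emacs shows is something other than the plain Unicode code point, which would fit the file being fine on disk (Firefox renders it) while the display inside emacs is not.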

My situation differs in that most of the non-ASCII characters (Chinese
in my case) come through just fine.  But the ones that don't have
those irritating boxes in place of the correct glyphs.

Lew Perin / address@hidden

I wouldn't be surprised if the gaps and overlaps in the CJK ranges of glyphs were so complicated that many characters from the following encodings are not included in utf-8, especially if they are not precomposed. Try some of these encodings to see if some of the empty boxes are resolved into characters:


Also it might help to install a fontset rather than depending on a single font to represent all these characters. Unfortunately I can't help with that. I am on w32 and I don't even know whether fontsets can be used in Emacs on that build.
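
Before hunting for fonts, it may help to know exactly which non-ASCII characters are in the file, so you can check whether a given font covers them. Here is a minimal Python sketch that does this outside emacs. The function name `non_ascii_report` and the sample text are made up for illustration, and it assumes the file really is UTF-8 (decoding fails loudly if it is not).

```python
import unicodedata
from collections import Counter


def non_ascii_report(data: bytes):
    """List each non-ASCII character in UTF-8 data with its code point,
    Unicode name, and number of occurrences."""
    # Raises UnicodeDecodeError if the bytes are not valid UTF-8.
    text = data.decode("utf-8")
    counts = Counter(ch for ch in text if ord(ch) > 127)
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), n)
        for ch, n in sorted(counts.items())
    ]


# Hypothetical sample standing in for the real file contents.
sample = "<!-- -*- coding: utf-8; -*- -->\nmā 中文 mā\n".encode("utf-8")

report = non_ascii_report(sample)
for code_point, name, count in report:
    print(code_point, name, count)
```

For a real file, pass in the bytes from `open(path, "rb").read()`, where `path` is wherever your file lives. If every character decodes and has a sensible name, the encoding is probably fine and the empty boxes are more likely a font coverage problem.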


