Hello,
I followed your instructions, and I think I made some progress.
When I reopened the file containing Cyrillic characters with the language environment set to UTF-8, describe-char gave the following results:
character: b (98, #o142, #x62, U+0062)
charset: ascii (ASCII (ISO646 IRV))
code point: #x62
syntax: w which means: word
category: a:ASCII graphic characters 32-126 (ISO646 IRV:1983[4/0]) l:Latin
buffer code: #x62
file code: #x62 (encoded by coding system mule-utf-8-dos)
display: by this font (glyph code)
-outline-Bitstream Vera Sans Mono-bold-r-normal-normal-16-120-96-96-c-*-iso8859-1 (#x62)
character: б (332881, #o1212121, #x51451, U+0431)
charset: mule-unicode-0100-24ff
(Unicode characters of the range U+0100..U+24FF.)
code point: #x28 #x51
syntax: w which means: word
category: y:Cyrillic
buffer code: #x9C #xF4 #xA8 #xD1
file code: #xD0 #xB1 (encoded by coding system mule-utf-8-dos)
display: by this font (glyph code)
-outline-Bitstream Vera Sans Mono-bold-r-normal-normal-16-120-96-96-c-*-iso10646-1 (#x431)
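As a side check on the "file code" line above: #xD0 #xB1 is exactly what a UTF-8 encoding of U+0431 should produce, so the file itself appears to be decoded correctly. A minimal sketch in Python (used here only for illustration; it is not part of the Emacs setup) that confirms the byte values:

```python
# U+0431 is CYRILLIC SMALL LETTER BE, the character reported by describe-char.
ch = "\u0431"

# Encode it as UTF-8 and format the bytes the way describe-char prints them.
utf8_bytes = ch.encode("utf-8")
file_code = " ".join(f"#x{b:02X}" for b in utf8_bytes)
print(file_code)  # prints "#xD0 #xB1"
```

This only checks the encoding step; it says nothing about which font Emacs picks for display.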
Comparing this result with the one in your previous message, it would appear that the font is the culprit. I invoke Emacs with the command-line options
"C:\Program Files\Emacs\emacs-22.1\bin\runemacs.exe" -g -0 --font "-outline-Bitstream Vera Sans Mono-bold-r-normal-normal-*-*-96-96-c-*-iso8859-1"
If I invoke Emacs simply with the command line
Emacs
then the describe-char commands yield:
character: b (98, #o142, #x62, U+0062)
charset: ascii (ASCII (ISO646 IRV))
code point: #x62
syntax: w which means: word
category: a:ASCII graphic characters 32-126 (ISO646 IRV:1983[4/0]) l:Latin
buffer code: #x62
file code: #x62 (encoded by coding system mule-utf-8-dos)
display: by this font (glyph code)
-outline-Courier New-normal-r-normal-normal-13-97-96-96-c-*-iso8859-1 (#x62)
character: б (332881, #o1212121, #x51451, U+0431)
charset: mule-unicode-0100-24ff
(Unicode characters of the range U+0100..U+24FF.)
code point: #x28 #x51
syntax: w which means: word
category: y:Cyrillic
buffer code: #x9C #xF4 #xA8 #xD1
file code: #xD0 #xB1 (encoded by coding system mule-utf-8-dos)
display: by this font (glyph code)
-outline-Courier New-normal-r-normal-normal-13-97-96-96-c-*-iso10646-1 (#x431)
and the Cyrillic characters are clearly visible.
However, this does not settle everything. When I invoke Emacs with the "problematic font" as described above, I can still display Cyrillic characters in a new file. Problems arise only when I _reopen_ the file. To investigate, I invoked Emacs with
"C:\Program Files\Emacs\emacs-22.1\bin\runemacs.exe" -g -0 --font "-outline-Bitstream Vera Sans Mono-bold-r-normal-normal-*-*-96-96-c-*-iso8859-1"
and entered the same lines as before in a new file (even without language environment = utf-8). describe-char yields
character: b (98, #o142, #x62, U+0062)
charset: ascii (ASCII (ISO646 IRV))
code point: #x62
syntax: w which means: word
category: a:ASCII graphic characters 32-126 (ISO646 IRV:1983[4/0]) l:Latin
buffer code: #x62
file code: #x62 (encoded by coding system iso-latin-1-dos)
display: by this font (glyph code)
-outline-Bitstream Vera Sans Mono-bold-r-normal-normal-16-120-96-96-c-*-iso8859-1 (#x62)
character: б (3665, #o7121, #xe51, U+0431)
charset: cyrillic-iso8859-5
(Right-Hand Part of Latin/Cyrillic Alphabet (ISO/IEC 8859-5): ISO-IR-144.)
code point: #x51
syntax: w which means: word
category: y:Cyrillic
buffer code: #x8C #xD1
file code: not encodable by coding system iso-latin-1-dos
display: by this font (glyph code)
-outline-Arial-bold-r-normal-normal-16-120-96-96-p-*-iso8859-5 (#x431)
In this case the Cyrillic characters are visible.
Thus it would appear that in this case Emacs is able to select a substitute font that renders the characters correctly. Why does it not do so when the file is reopened?
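For reference, the "not encodable by coding system iso-latin-1-dos" line in the last output is what one would expect: ISO 8859-1 has no Cyrillic letters, while ISO 8859-5 maps this character to the single byte #xD1, which matches the last byte of the buffer code shown. A small Python sketch of that comparison (Python and the helper name try_encode are illustrative assumptions, not anything Emacs uses):

```python
# U+0431 is CYRILLIC SMALL LETTER BE, the character typed into the new file.
ch = "\u0431"

def try_encode(text, codec):
    """Return the encoded bytes, or None if the codec cannot represent text."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None

latin1 = try_encode(ch, "iso8859_1")    # Latin-1 has no Cyrillic letters
cyrillic = try_encode(ch, "iso8859_5")  # Latin/Cyrillic alphabet
print(latin1)
print(cyrillic)
```

So a new buffer holding this character would need a different coding system (for example UTF-8) before it could be saved, which may be relevant to why the reopened file ends up in a different charset than the freshly typed text.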
Regards,
Bostjan
----- Original Message ----
From: martin rudalics <rudalics@gmx.at>
To: Bostjan Vilfan <bvilf@yahoo.com>
Cc: Bug-Gnu-Emacs <bug-gnu-emacs@gnu.org>
Sent: Wednesday, November 21, 2007 8:23:07 AM
Subject: Re: Problem with multilingual input?
> On Windows I tried your suggestion (set-language-environment) and the
> result was the same (empty rectangles). Then I selected
> Options->Mule-Show All of Mule Status and read off the current
> language environment as UTF-8. Thus, language environment equals utf-8
> or English does not influence the outcome.
Sorry for the delay. I hoped someone else would respond but apparently
all language environment experts are busy at the moment. Please CC to
bug-gnu-emacs when answering - maybe we'll get qualified help.
> On Linux the outcome is OK (cyrillic characters visible), again
The correct name of this OS is GNU/Linux.
> regardless of the language environment settings (utf-8 or English)
When I have a file saved with mule-utf-8 containing the two lines
bla bla
бла бла
visit the file with `current-language-environment' utf-8 and do
`describe-char' for the first character of the first line I get
character: b (98, #o142, #x62, U+0062)
charset: ascii (ASCII (ISO646 IRV))
code point: #x62
syntax: w which means: word
category: a:ASCII graphic characters 32-126 (ISO646 IRV:1983[4/0]) l:Latin
buffer code: #x62
file code: #x62 (encoded by coding system mule-utf-8-dos)
display: by this font (glyph code)
-outline-Courier New-normal-r-normal-normal-16-96-120-120-c-*-iso8859-1 (#x62)
`describe-char' for the first character of the second line gets me
character: б (332881, #o1212121, #x51451, U+0431)
charset: mule-unicode-0100-24ff
(Unicode characters of the range U+0100..U+24FF.)
code point: #x28 #x51
syntax: w which means: word
category: y:Cyrillic
buffer code: #x9C #xF4 #xA8 #xD1
file code: #xD0 #xB1 (encoded by coding system mule-utf-8-dos)
display: by this font (glyph code)
-outline-Courier New-normal-r-normal-normal-16-96-120-120-c-*-iso10646-1 (#x431)
on Windows ME. What do you get?