Re: Usage of standard-display-table in MSDOS

From: Ehud Karni
Subject: Re: Usage of standard-display-table in MSDOS
Date: Fri, 27 Aug 2010 16:35:40 +0300

On Thu, 26 Aug 2010 19:43:48 Eli Zaretskii wrote:
> > From: "Ehud Karni" <address@hidden>
> >
> > No, I want Hebrew of any kind - DOS (CP862), UNIX (ISO-8859-8) and UTF
> > to appear in Hebrew on BOTH text terminals and X.
> Sorry, I don't understand: what do you mean by "Hebrew of any kind"?
> In Emacs 23 and later, there's only one kind of Hebrew: the Unicode
> kind.  All the characters, including Hebrew, are internally
> represented as their Unicode codepoints.  When Emacs visits a file
> encoded in cp862, it converts the encoded characters into their
> Unicode codepoints.  What is delivered to the screen is either some
> encoding, like cp862 (in the case of a text terminal), or a glyph from
> some font (on GUI terminals).  In both of these cases, Emacs
> translates the Unicode codepoints to either the corresponding cp862
> etc. codes, or to the codes of the characters in the font used to
> display Hebrew.  All that's needed for Emacs to DTRT is (a) that Emacs
> knows it is dealing with Hebrew characters, and (b) for text terminals
> only, that the terminal encoding is set up according to the encoding
> the terminal expects.
> Now, what am I missing to understand why you needed to use display
> tables?

You are missing the point that most of my files are not "word-processor"
(or HTML/XML) files but data files that are read with either ISO-8859-8
or no-conversion (binary) encoding.

Now, some of them have DOS Hebrew (#x80-9A) and graphic characters in
them, in ADDITION to UNIX Hebrew (#xE0-FA). I still want to see both as
Hebrew characters (so I can read them), but with a distinction between
the 2 Hebrew types: I want to know the 8-bit encoding, it matters.
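
To make the two ranges concrete, here is a minimal sketch in Python
(outside Emacs, purely as an illustration; the names classify_byte,
DOS_HEBREW and UNIX_HEBREW are made up for this example). It assumes
the letters Alef..Tav sit at #x80-#x9A in CP862 and at #xE0-#xFA in
ISO-8859-8, and it reports both the letter and which of the 2 types the
byte belongs to:

```python
# Byte ranges as described in this message (upper bound is exclusive).
DOS_HEBREW = range(0x80, 0x9B)   # CP862: Alef..Tav
UNIX_HEBREW = range(0xE0, 0xFB)  # ISO-8859-8: Alef..Tav


def classify_byte(b):
    """Return (character, origin) for a single byte value 0..255.

    origin is "dos", "unix" or "other"; character is None for
    non-ASCII bytes that fall outside both Hebrew ranges.
    """
    if b in DOS_HEBREW:
        return bytes([b]).decode("cp862"), "dos"
    if b in UNIX_HEBREW:
        return bytes([b]).decode("iso8859_8"), "unix"
    return (chr(b) if b < 0x80 else None), "other"


if __name__ == "__main__":
    # Mixed data: two DOS letters, a space, two UNIX letters.
    data = b"\x80\x81 \xe0\xe1"
    for byte in data:
        print(hex(byte), classify_byte(byte))
```

The same letter comes out for #x80 and #xE0, but the second element of
the result keeps the information about the original 8-bit code.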

When I visit a file literally (i.e. no conversion) I still want to see
the Hebrew (and DOS graphic) characters as Hebrew and graphics, not as
an octal representation.

So I have to use a display table, and I want it to work for both text
terminals and X (or other windowed system - Mac, MS - which I myself
don't use).

> These graphic characters are part of Unicode as well (in the U+25XX
> block), and Emacs 23 knows how to encode them in cp862, or any other
> codepage that supports these characters.  Try "C-x 8 RET 2525 RET" and
> see for yourself, it has a valid cp862 encoding.

What I want is just a subset of this in my display table, so that bytes
in the range #xB0-#xDF will be shown as is on a text terminal and as the
CP862 glyphs on X (I am willing to have a different display table for
each case, I don't use a text terminal and X in the same Emacs instance).
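
The range I mean can be listed with a short Python sketch (again
outside Emacs and only an illustration; cp862_graphics is a
hypothetical helper name). It decodes each byte in #xB0-#xDF with the
cp862 codec, giving the box-drawing and block characters that I would
like to see on X:

```python
# Bytes #xB0-#xDF, the graphic range mentioned above (upper bound exclusive).
GRAPHIC_BYTES = range(0xB0, 0xE0)


def cp862_graphics():
    """Map each byte in #xB0-#xDF to the character CP862 assigns to it."""
    return {b: bytes([b]).decode("cp862") for b in GRAPHIC_BYTES}


if __name__ == "__main__":
    for byte, char in cp862_graphics().items():
        # Print the byte, the Unicode code point and the glyph itself.
        print(f"#x{byte:02X} -> U+{ord(char):04X} {char}")
```

Such a byte-to-character list is the kind of data a display table for X
would need; on a text terminal that already expects CP862 the bytes
could be sent unchanged.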

I know how to do it when the locale environment is set to "en_GB".
Can you instruct me how to do this when the locale environment is set
to "he_IL" ?

Just as a curiosity, sometimes I get files where the Hebrew is encoded
as the lower Latin letters and Aleph is represented by @ (this is
known as old-code and it is still used by some companies, even though
some other applications already use UTF-8 XML files).

Do you have a way to display it as Hebrew without a display table ?


 Ehud Karni           Tel: +972-3-7966-561  /"\
 Mivtach - Simon      Fax: +972-3-7976-561  \ /  ASCII Ribbon Campaign
 Insurance agencies   (USA) voice mail and   X   Against   HTML   Mail
 http://www.mvs.co.il  FAX:  1-815-5509341  / \
 GnuPG: 98EA398D <http://www.keyserver.net/>    Better Safe Than Sorry

