Re: Usage of standard-display-table in MSDOS


From: Kenichi Handa
Subject: Re: Usage of standard-display-table in MSDOS
Date: Mon, 06 Sep 2010 14:14:01 +0900

In article <address@hidden>, "Ehud Karni" <address@hidden> writes:

> I attach a tar.bz2 file with 3 files:
> 1. lit1 - the sample file.
> 2. lit1-tty.png - how it should show on text terminal.
> 3. lit1-x.png   - how it should show on X.

> I can do it if I read the file with the iso-latin-1 coding-system
> and change the display table to show the Hebrew glyphs for the Hebrew
> [#xE0-#xFA] bytes. But in this way it is not Hebrew characters (e.g.
> for the new bidi display). I want it the other way around, to read it
> with hebrew-iso-8bit and to tweak the display table to show all
> the bytes not belonging to the Hebrew set.

Does that mean you want bidi-reordering for the bytes
#xE0..#xFA (code-points of iso-8859-8), but that
bidi-reordering is not necessary for the bytes #x80..#x9A
(the Hebrew code-points of cp862)?

But your file "lit1" contains #xE0..#xFA (code-points of
iso-8859-8) on the second to fourth lines in visual order.  If
bidi-reordering is applied to them, you'll get a different
view from lit1-tty.png and lit1-x.png.  Is that ok?

> I had similar problem a long time ago. In 2001 you suggested to use
> the following code:

>   (make-coding-system
>       'hebrew-iso-8bit 2 ?8
>       "ISO 2022 based 8-bit encoding for Hebrew (MIME:ISO-8859-8)"
>       '(ascii hebrew-iso8859-8 nil nil
>               nil ascii-eol ascii-cntl nil nil nil nil nil t)
>       '((safe-charsets ascii hebrew-iso8859-8 eight-bit-control)
>         (mime-charset . iso-8859-8)))

> Maybe I can define a new coding system that will have bytes #x80-#xFF
> as legal characters and be recognized as a Hebrew variant.

This code will do that.  I think it's not difficult to
understand what the code is doing.

------------------------------------------------------------
(define-charset 'cp862-sub
  "Subset of CP862"
  :code-space [#x80 #xDF]
  :subset '(cp862 #x80 #xDF #x00))

(define-charset 'iso-8859-8-sub
  "Subset of ISO-8859-8"
  :code-space [#xE0 #xFA]
  :subset '(iso-8859-8 #xE0 #xFA #x00))

(define-coding-system 'mix-hebrew
  "Mixture of ISO-8859-8 and CP862"
  :mnemonic ?H
  :coding-type 'charset
  :charset-list '(ascii iso-8859-8-sub cp862-sub)
  :ascii-compatible-p t)
------------------------------------------------------------

Please try C-x C-m c mix-hebrew RET lit1 RET.

But if you do that, you must consider the problem Eli wrote about:

In article <address@hidden>, Eli
Zaretskii <address@hidden> writes:

> But if you want all the Hebrew characters to be treated by Emacs as
> such (e.g., for bidi display), no matter what's their encoding in the
> file, you will have to define a coding-system that will decode them
> all into Unicode codepoints of Hebrew characters.  There's a problem
> you will need to solve for defining such a coding system: it has 2
> different encodings for the same character, one from hebrew-iso-8bit,
> the other from cp862.  So you will need to decide how will Hebrew
> characters be encoded when the file is saved.

In the above definition of mix-hebrew, as iso-8859-8-sub is
listed before cp862-sub, all Hebrew characters are encoded
into bytes #xE0..#xFA even if they were originally decoded
from bytes #x80..#x9A.
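
The effect of that ordering can be checked with a small round-trip
test.  This is only a sketch: it repeats the definitions above so that
it can be evaluated on its own, and the helper name
my-mix-hebrew-roundtrip is hypothetical (it is not part of Emacs).

```elisp
;; Definitions repeated from above so that this block is self-contained.
(define-charset 'cp862-sub
  "Subset of CP862"
  :code-space [#x80 #xDF]
  :subset '(cp862 #x80 #xDF #x00))

(define-charset 'iso-8859-8-sub
  "Subset of ISO-8859-8"
  :code-space [#xE0 #xFA]
  :subset '(iso-8859-8 #xE0 #xFA #x00))

(define-coding-system 'mix-hebrew
  "Mixture of ISO-8859-8 and CP862"
  :mnemonic ?H
  :coding-type 'charset
  :charset-list '(ascii iso-8859-8-sub cp862-sub)
  :ascii-compatible-p t)

;; Hypothetical helper, introduced only for this illustration.
(defun my-mix-hebrew-roundtrip (byte)
  "Decode BYTE with `mix-hebrew', re-encode it, and return the new byte."
  (let* ((decoded (decode-coding-string (unibyte-string byte) 'mix-hebrew))
         (encoded (encode-coding-string decoded 'mix-hebrew)))
    (aref encoded 0)))

;; Per the explanation above, ALEF read as cp862 byte #x80 should be
;; written back as iso-8859-8 byte #xE0, while #xE0 stays as it is.
(list (my-mix-hebrew-roundtrip #x80)
      (my-mix-hebrew-roundtrip #xE0))
```

If both elements of the returned list are #xE0, the file will not
round-trip byte for byte whenever it contained Hebrew letters in
cp862 encoding.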

If you don't like that, you must give up decoding bytes
#x80..#x9A into Hebrew chars.  Instead, you decode them as raw
bytes and set up a display table to display them as Hebrew chars.
It can be done with this code:

------------------------------------------------------------
(define-charset 'cp862-sub
  "Subset of CP862"
  :code-space [#x9B #xDF]
  :subset '(cp862 #x9B #xDF #x00))

(define-charset 'iso-8859-8-sub
  "Subset of ISO-8859-8"
  :code-space [#xE0 #xFA]
  :subset '(iso-8859-8 #xE0 #xFA #x00))

(define-coding-system 'mix-hebrew
  "Mixture of ISO-8859-8, CP862, and raw 8-bit bytes"
  :mnemonic ?H
  :coding-type 'charset
  :charset-list '(ascii iso-8859-8-sub cp862-sub eight-bit)
  :ascii-compatible-p t)

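(require 'disp-table)
;; Make sure there is a display table to modify; the variable may be nil.
(unless standard-display-table
  (setq standard-display-table (make-display-table)))
;; Display bytes #x80..#x9A as Hebrew chars (code-points #xE0..#xFA of
;; ISO-8859-8).  #x1B is 27, the number of bytes in that range.
(dotimes (i #x1B)
  (aset standard-display-table
        (unibyte-char-to-multibyte (+ #x80 i))
        (vector (decode-char 'iso-8859-8 (+ #xE0 i)))))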
------------------------------------------------------------

This display-table setting also works on a terminal as long as
you set the terminal coding system to mix-hebrew.
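
To see the difference between this second definition and the first
one, you can look at what a single byte decodes into.  Again, this is
only a sketch: the coding-system definitions are repeated so that the
block can be evaluated alone, and my-mix-hebrew-decode-byte is a
hypothetical helper, not part of Emacs.

```elisp
;; Definitions repeated from above so that this block is self-contained.
(define-charset 'cp862-sub
  "Subset of CP862"
  :code-space [#x9B #xDF]
  :subset '(cp862 #x9B #xDF #x00))

(define-charset 'iso-8859-8-sub
  "Subset of ISO-8859-8"
  :code-space [#xE0 #xFA]
  :subset '(iso-8859-8 #xE0 #xFA #x00))

(define-coding-system 'mix-hebrew
  "Mixture of ISO-8859-8, CP862, and raw 8-bit bytes"
  :mnemonic ?H
  :coding-type 'charset
  :charset-list '(ascii iso-8859-8-sub cp862-sub eight-bit)
  :ascii-compatible-p t)

;; Hypothetical helper, introduced only for this illustration.
(defun my-mix-hebrew-decode-byte (byte)
  "Return the character that `mix-hebrew' decodes the single BYTE into."
  (aref (decode-coding-string (unibyte-string byte) 'mix-hebrew) 0))

(list
 ;; Byte #x80 is expected to stay a raw 8-bit byte, not a Hebrew char.
 (eq (my-mix-hebrew-decode-byte #x80) (unibyte-char-to-multibyte #x80))
 ;; Byte #xE0 is expected to become the Hebrew char from ISO-8859-8.
 (eq (my-mix-hebrew-decode-byte #xE0) (decode-char 'iso-8859-8 #xE0)))
```

Because bytes #x80..#x9A stay raw bytes in the buffer, they are saved
unchanged, but Emacs does not treat them as Hebrew characters (for
instance, for bidi-reordering); only the display table makes them look
like Hebrew.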

---
Kenichi Handa
address@hidden


