Re: [Groff] Cyrillic (cp1251) support for groff -Tps
From: Werner LEMBERG
Subject: Re: [Groff] Cyrillic (cp1251) support for groff -Tps
Date: Tue, 24 Jul 2001 18:01:34 +0200 (CEST)
> Use the \N'nnn' mechanism in groff (you have to in this case
> since the mapfile was empty so there are no groff names for the
> characters as yet). For this sort of thing I prefer to set up
> macros, maybe job-specific, which invoke an IPA context and then
> define characters like
>
> .char \[ng] \N'78'
>
> Or (say using .ftr IPA IPA_T ) simply
>
> .char \[ng] \f[IPA]\N'78'\fP
>
> which would enable you to drop in an IPA character on the fly
> without having to switch to an IPA context. Also, you can easily
> name them as you please at any time, without pre-empting names
> you might need for something else.
While this may be sufficient for IPA, it is generally a bad idea to
use \N'...' since it mixes up glyph encoding with input encoding.
Below I quote an exchange with Ruslan Ermilov (address@hidden) which
shows a possible way to support koi8-r with a preprocessor. As you
may already know or have discovered, groff can't support koi8-r
directly as an input encoding (only 8859-5), because koi8-r uses
character codes which groff needs for internal use and therefore
treats as `illegal' input.
I'm currently not sure whether this affects cp1251 as well.
Werner
PS: devkoi8-r is not part of default groff but an extension from
Ruslan. You won't need it to print Russian to a PS printer.
======================================================================
> As a consequence, direct koi8-r input is not possible currently. My
> idea was to use \N'...' (assuming TTY output where groff's output
> encoding is a TTY's input encoding, so to say), but \[...] is better
> of course since it yields a cleaner interface, separating input from
> output encoding. I will eventually remove the hardcoded `charXXX'
> character names since it intermixes input and output encoding. What
> about a converter like this (choose better character names, please):
>
> koi8-r glyph name
>
> 0x80 -> \[bdlh] # box drawings light horizontal
> 0x81 -> \[bdlv]
> 0x82 -> \[bdldr]
> 0x83 -> \[bdldl]
> .
> .
> 0xFF -> \[cyrHardsign] # cyrillic capital letter hard sign
>
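The converter proposed above could look roughly like the following
Python sketch. It is only an illustration: the glyph names are the
placeholder names from the table (not names groff defines), only the
four entries spelled out there are filled in, and the function name
to_named_glyphs is made up for this example.

```python
# Placeholder glyph names copied from the table in the quoted mail.
# They are NOT established groff names; a real table would cover 0x80-0xFF.
KOI8R_TO_GLYPH = {
    0x80: "bdlh",   # box drawings light horizontal
    0x81: "bdlv",
    0x82: "bdldr",
    0x83: "bdldl",
}


def to_named_glyphs(data: bytes, table=KOI8R_TO_GLYPH) -> bytes:
    """Replace every byte listed in `table` with a \\[name] escape.

    Bytes without a table entry are copied through unchanged.
    """
    out = bytearray()
    for b in data:
        name = table.get(b)
        if name is None:
            out.append(b)
        else:
            out += b"\\[" + name.encode("ascii") + b"]"
    return bytes(out)
```

With such a table the input encoding lives entirely in the converter,
and groff itself only ever sees glyph names.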
This is probably the only right solution, but it is a long-term one.
How about this in the meantime?
Implement a small preprocessor which merely converts all non-ASCII
input characters to \[charXXX] sequences. groff(1) then invokes it
automatically once a ``prepro gro8to7'' line is put in the
{device}/DESC file.
To speed things up, the preprocessor should probably convert only
the characters groff uses internally (those for which
illegal_input_char() returns true).
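A minimal Python sketch of such a filter, not the actual gro8to7, may
make the idea concrete. Two things are assumptions here: that XXX in
\[charXXX] is the decimal character code, and that the codes groff
rejects in the upper half are 0x80-0x9F. The second predicate is only
a stand-in for illegal_input_char(); check the groff sources before
relying on it.

```python
import sys


def needs_escape_all(b: int) -> bool:
    # First variant: escape every non-ASCII byte.
    return b >= 0x80


def needs_escape_reserved(b: int) -> bool:
    # ASSUMPTION: stand-in for groff's illegal_input_char(), limited to
    # the upper half; the real function decides which codes are rejected.
    return 0x80 <= b <= 0x9F


def gro8to7(data: bytes, needs_escape=needs_escape_all) -> bytes:
    """Rewrite selected bytes as \\[charNNN], NNN being the decimal code."""
    out = bytearray()
    for b in data:
        if needs_escape(b):
            out += b"\\[char%d]" % b
        else:
            out.append(b)
    return bytes(out)


def main() -> None:
    # Filter mode: raw 8-bit text on stdin, converted text on stdout.
    sys.stdout.buffer.write(gro8to7(sys.stdin.buffer.read()))
```

Calling main() turns the module into a stdin-to-stdout filter, which
is the shape a program named on a ``prepro'' line would need. Passing
needs_escape_reserved gives the faster variant that leaves all other
8-bit characters untouched.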
I have tried this technique with devkoi8-r, and I get the proper,
warning-free output from the raw (koi8-r) input. (I have tested
all characters in the upper 0x80-0xff range.)
I think this is the best thing we could do before Groff 2.0.