Re: [Groff] Problems with unwanted unicode.

From: Tomohiro KUBOTA
Subject: Re: [Groff] Problems with unwanted unicode.
Date: Wed, 18 Jul 2001 02:02:15 +0900
User-agent: Wanderlust/1.1.1 (Purple Rain) EMY/1.13.8 (Tastes differ) FLIM/1.13.2 (Kasanui) APEL/10.2 Emacs/20.7 (i386-debian-linux-gnu) MULE/4.1 (AOI)


At Sun, 15 Jul 2001 17:23:25 +0200 (CEST),
Werner LEMBERG <address@hidden> wrote:

> > Then, could you consider tentatively integrating the Japanese patch
> > into the official version of Groff?
> I would really like to avoid it.  Main reason is that the Japanese
> patch basically bypasses the way groff processes input.  It
> doesn't extend the structures by making them 32bit-aware but adds new
> variables instead.

I see.  However, the Japanese patch implements kinsoku and other
processing that Groff 2.0 would need.  Even if you cannot accept
the way the Japanese patch handles encodings, many parts of the
patch would be helpful.

Also, I should mention that a new version of the Japanese patch
was released yesterday (though I think it is experimental).  It is
found at (please check
groff_1.17.2-1.ukai.1.diff.gz) and it seems to be more
i18n-oriented (as opposed to hard-coding Japanese).

> Another problem is how to handle fonts in groff.  It seems that we
> have to extend the current font file syntax to allow ... what exactly?
> Please make suggestions.  Some ideas:
>   . Glyph classes.  We'll need that for defining CJK metrics and kerns
>     in a compact way.

All ideographs are fixed-width.  (I think the kerning of ideographs
mentioned on p. 362 of Ken Lunde's CJKV book is not needed.  Even the
Japanese versions of TeX [the NTT version and the ASCII version] don't
support it.)  Korean Hangul is also fixed-width in traditional
typesetting.  However, I recently heard about a new way of typesetting
in which the elements of Hangul are written at a fixed size.

Excluding ideographs and Hangul, there are not many glyphs for CJK.
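
As a rough illustration of why glyph classes keep CJK metrics compact,
here is a minimal Python sketch.  It assumes one class-wide advance
width per Unicode range; the class names, the ranges, and the 1000-unit
width are illustrative assumptions, not part of any existing groff font
file syntax.

```python
# Hypothetical sketch: one metrics entry per glyph class instead of one
# per glyph.  Ranges, names, and the 1000-unit width are assumptions.
GLYPH_CLASSES = [
    # (first code point, last code point, class name, advance width)
    (0x4E00, 0x9FFF, "ideograph", 1000),  # CJK Unified Ideographs
    (0xAC00, 0xD7A3, "hangul", 1000),     # Hangul Syllables
]


def class_width(ch, per_glyph_widths):
    """Return the advance width of ch, preferring a class-wide entry."""
    cp = ord(ch)
    for first, last, _name, width in GLYPH_CLASSES:
        if first <= cp <= last:
            return width
    # Everything else (Latin, kana, punctuation) needs its own entry.
    return per_glyph_widths[ch]
```

With this layout, tens of thousands of ideographs and Hangul syllables
share two table rows, and only the remaining glyphs need individual
metrics.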

>   . Subfonts.  [I don't mean the splitting of a large font into
>     entities with, say, 256 glyphs each.]  Using a single Unicode font
>     is nonsense.  Instead, groff should provide a means to group fonts
>     into Unicode blocks.  If a new font is registered, it sets some
>     flags to indicate which block is covered.  A mechanism to provide
>     a fallback font is needed also.  It could also be used for the
>     other way round:  Registering a huge font as many subfonts makes
>     the loading much faster.

I understand.  Different scripts have different categories of typeface,
so a single Unicode font cannot be very appropriate.  For Latin
script, Roman, Helvetica, Courier, and so on are available, while
Mincho, Gothic, and so on are available for CJK ideographs.  Also,
new typefaces are still being developed.

A concept of a fontset, which is a set of subfonts, could be introduced.
For example, a Roman typeface and a Mincho typeface would together
form a fontset.

The existing typeface commands (such as \fI and so on) should change
the typeface of every subfont.  Some special commands to select one
element of the fontset, i.e., one subfont, might be introduced.
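
The fontset idea could be sketched roughly as follows.  This is only an
illustration in Python: the fontset names ("serif", "sans"), the code
point ranges, the pairing of Helvetica with Gothic, and the "Fallback"
font are all assumptions made for the example, not groff's actual data
structures.

```python
# Hypothetical fontset table: each fontset maps Unicode ranges to a
# subfont name.  All names and ranges here are illustrative assumptions.
FONTSETS = {
    "serif": [((0x0000, 0x024F), "Roman"), ((0x3000, 0x9FFF), "Mincho")],
    "sans": [((0x0000, 0x024F), "Helvetica"), ((0x3000, 0x9FFF), "Gothic")],
}
FALLBACK = "Fallback"  # used when no subfont covers the character


def select_subfont(fontset, ch):
    """Pick the subfont of the given fontset that covers ch."""
    cp = ord(ch)
    for (first, last), font in FONTSETS[fontset]:
        if first <= cp <= last:
            return font
    return FALLBACK


def split_runs(fontset, text):
    """Group text into (subfont, run) pairs in reading order."""
    runs = []
    for ch in text:
        font = select_subfont(fontset, ch)
        if runs and runs[-1][0] == font:
            runs[-1] = (font, runs[-1][1] + ch)
        else:
            runs.append((font, ch))
    return runs
```

Switching the whole fontset (here, from "serif" to "sans") changes the
typeface for every script at once, which corresponds to the behaviour
suggested for the existing typeface commands.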

>   . Interaction between fonts to implement effects like kinsoku shori
>     (i.e., less space between a CJK and non-CJK character than between
>     two non-CJK characters).  It's not completely clear to me how to
>     achieve that in a simple way.

Kinsoku means characters that are prohibited at the end or start of
a line.  For example, the Japanese character that corresponds to "("
cannot appear at the end of a (visual) line.
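
A minimal Python sketch of this rule, under simplifying assumptions:
lines are measured in characters rather than real glyph widths, the two
character sets are small illustrative samples, and a break point is
simply moved earlier until it is allowed.  It is not how the Japanese
patch implements kinsoku.

```python
# Illustrative samples only; real kinsoku tables are larger.
NO_LINE_END = set("（「『［｛")        # openers must not end a line
NO_LINE_START = set("）」』］｝、。")  # closers/punctuation must not start one


def break_lines(text, width):
    """Break text into lines of at most `width` characters, moving each
    break point earlier while it would violate kinsoku.  If no legal
    break exists, the line is cut after one character anyway."""
    lines = []
    start = 0
    while len(text) - start > width:
        cut = start + width
        while cut > start + 1 and (
            text[cut - 1] in NO_LINE_END or text[cut] in NO_LINE_START
        ):
            cut -= 1
        lines.append(text[start:cut])
        start = cut
    lines.append(text[start:])
    return lines
```

For example, with a width of 4 the text "これは（例）です" is not cut
after "（"; the opening parenthesis is carried over to the next line
instead.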

I think the small space between a CJK and a non-CJK character is
kerning.  Anyway, I think these features are implemented
in the Japanese patch.

久保田智広 Tomohiro KUBOTA <address@hidden>
"Introduction to I18N"
