Re: [Groff] Re: unicode support, part 14: unicode fonts


From: Werner LEMBERG
Subject: Re: [Groff] Re: unicode support, part 14: unicode fonts
Date: Thu, 10 Aug 2006 08:14:14 +0200 (CEST)

> For Unicode fonts (which ought to be increasingly the norm), the
> proposal to write out all glyph properties in the font file seems
> odd; as far as I understand the point of Bruno's Unicode fonts
> versus enumerated fonts is to avoid the need to write out properties
> in font files which are really properties of the Unicode code
> points.  Can these properties not be autogenerated from
> UnicodeData.txt (and others, e.g.  EastAsianWidth.txt) and used
> automatically for all Unicode fonts?  Glyph classes would then be
> useful for efficient internal storage, but there would be no urgent
> need to represent them in the font files.

Please bear in mind that groff, like TeX, doesn't store character
information; everything is related to glyphs -- I won't accept a
solution which works for a particular device only.  For example, take
a Japanese PS font; you can't safely assume that the font's
`full-width' characters are actually full-width, and treating them as
such gives poor typographical output.  We *need* glyph classes.  Of course,
EastAsianWidth.txt and other Unicode data files can be used to
autogenerate the font description files for devutf8 and devhtml, but I
don't want to store the data hardcoded in troff.
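
As a rough illustration of the autogeneration step, here is a minimal
Python sketch that reads lines in the EastAsianWidth.txt format and
groups the code points into ranges per width class.  The function name
parse_east_asian_width and the sample data are hypothetical and not
part of groff; the sketch deliberately stops at the range table and
does not emit any font description syntax, since how glyph classes
would be written there is not settled here.

```python
import re
from collections import defaultdict

# Hypothetical helper, not part of groff: parse lines in the format of
# Unicode's EastAsianWidth.txt into {width_class: [(first, last), ...]}.
LINE_RE = re.compile(
    r"^\s*([0-9A-Fa-f]{4,6})(?:\.\.([0-9A-Fa-f]{4,6}))?\s*;\s*(\w+)"
)


def parse_east_asian_width(lines):
    classes = defaultdict(list)
    for line in lines:
        line = line.split("#", 1)[0]  # drop trailing comments
        m = LINE_RE.match(line)
        if not m:
            continue  # blank or comment-only line
        first = int(m.group(1), 16)
        last = int(m.group(2), 16) if m.group(2) else first
        ranges = classes[m.group(3)]
        # Merge with the previous range of this class when adjacent.
        if ranges and ranges[-1][1] + 1 == first:
            ranges[-1] = (ranges[-1][0], last)
        else:
            ranges.append((first, last))
    return dict(classes)


# Assumed sample input in EastAsianWidth.txt format (not the full file).
SAMPLE = """\
# sample lines in EastAsianWidth.txt format
3000;F # IDEOGRAPHIC SPACE
3001..3003;W # IDEOGRAPHIC COMMA..DITTO MARK
3004;W # JAPANESE INDUSTRIAL STANDARD SYMBOL
FF01..FF03;F
FF61..FF64;H
"""

if __name__ == "__main__":
    table = parse_east_asian_width(SAMPLE.splitlines())
    for cls, ranges in sorted(table.items()):
        print(cls, " ".join(f"U+{a:04X}..U+{b:04X}" for a, b in ranges))
```

A generator script for devutf8 or devhtml could consume such a table
when writing the font description files, which keeps the Unicode data
out of troff itself.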

> It feels like groff is quite close to being able to render CJK
> reasonably well - the major omissions seem to be width handling and
> kinsoku shori (is that an accurate assessment?)

This is correct.

> (In addition, the Debian patches also create an "ascii8" device,
> which is a curious little hack that effectively passes through
> characters encoded according to the current locale - so if the input
> to ascii8 is ISO-8859-2, then you get ISO-8859-2 output.  At
> present, man uses this device for Czech, Croatian, Hungarian,
> Polish, Russian, Slovak, and Turkish.  Obviously this device is
> typographically dubious at best, so I'll replace it by use of
> preconv/soelim/whatever and an iconv postprocessing step;
> latin2.tmac and latin5.tmac would work as well but those appear to
> be largely superseded by preconv.)

latin2.tmac and friends are *not* superseded; you need them for proper
hyphenation.  Have a look at my recent answer to a mail called `koi8-r
hyphenation revisited'.


    Werner



