Re: [Groff] unicode support - where to compose?

From: Werner LEMBERG
Subject: Re: [Groff] unicode support - where to compose?
Date: Wed, 22 Feb 2006 09:46:36 +0100 (CET)

> When an input file contains the character <U+1EBF>, preconv
> transforms it to \[u1EBF], and troff transforms it to a single glyph
> u0065_0302_0301.  Fine.
> But when an input file contains the characters
> <U+0078><U+0302><U+0301>, preconv transforms it to
> x\[u0302]\[u0301], and troff produces three distinct glyphs x,
> u0302, u0301.  This is wrong.


> But should the composition be handled within preconv or within
> troff?  In other words, what should happen if the input file
> contains
>                x\[u0302]\[u0301]  ?

groff doesn't do anything special yet, and I must admit that I haven't
thought about this problem.  Or rather, I've delayed it :-)

> Is groff allowed to combine these three input nodes into a single
> one?

Not yet.

> Or is there some principle in the groff input language that would
> force groff to consider these as three different units?

There is no such limit in the input language, but there is one in the
GNU troff engine itself.  Currently, groff only recognizes a very
limited set of ligatures (fi, ff, etc.), and this set can't be extended
dynamically.  This has to be fixed in the future, but it's probably
something which should be done later.

> In the first case I would put the composition into troff.

OK.  In other words, it won't be handled yet.

> In the second case into preconv (i.e. preconv would translate
> <U+0078><U+0302><U+0301> to \[x u0302 u0301] but would leave alone
> x\[u0302]\[u0301]).

This would be perfect.  If I understand you correctly, your approach
will be table-driven, that is, a combining character following a base
character will automatically be converted to the \[xxx yyy ...] form,
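
A minimal sketch of that table-driven idea, in Python rather than in
preconv's own code.  The function names `glyph_name`, `preconv_sketch`
and `decomposed_glyph_name` are hypothetical, and Python's `unicodedata`
module stands in for whatever table preconv would really use.  It does
not treat a space, a newline, or a backslash specially when one of them
is the base character.

```python
import unicodedata


def glyph_name(ch: str) -> str:
    """Return ch itself for ASCII, else a uXXXX name (assumed naming)."""
    return ch if ord(ch) < 0x80 else f"u{ord(ch):04X}"


def preconv_sketch(text: str) -> str:
    """Group each base character with the combining marks that follow it."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        j = i + 1
        # Only a non-combining character can act as a base.
        if unicodedata.combining(ch) == 0:
            # The "table": a nonzero canonical combining class marks
            # a combining character.
            while j < len(text) and unicodedata.combining(text[j]) != 0:
                j += 1
        marks = text[i + 1:j]
        if marks:
            names = [glyph_name(ch)] + [glyph_name(m) for m in marks]
            out.append("\\[" + " ".join(names) + "]")
        elif ord(ch) < 0x80:
            # ASCII, including escapes already typed by the user,
            # passes through unchanged.
            out.append(ch)
        else:
            out.append("\\[" + glyph_name(ch) + "]")
        i = j
    return "".join(out)


def decomposed_glyph_name(ch: str) -> str:
    """Build a u0065_0302_0301 style name from the canonical decomposition."""
    nfd = unicodedata.normalize("NFD", ch)
    return "u" + "_".join(f"{ord(c):04X}" for c in nfd)
```

With this sketch, `preconv_sketch("x\u0302\u0301")` gives
`\[x u0302 u0301]`, while the already escaped input `x\[u0302]\[u0301]`
comes back untouched, which matches the behaviour proposed above.
`decomposed_glyph_name("\u1EBF")` gives `u0065_0302_0301`, the glyph
name quoted at the start of this message.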

