Re: [Groff] handling of composing and combined Unicode characters


From: Werner LEMBERG
Subject: Re: [Groff] handling of composing and combined Unicode characters
Date: Tue, 10 Jan 2006 13:56:41 +0100 (CET)

> > Either you register `u0045_0302_0301' with .char directly in your
> > document (or in a proper macro file, say, `vi.tmac'), or you add
> > this to the devutf8 font description files.
> 
> I want to get rid of the description files for everything that is
> related to Unicode characters and can be derived algorithmically.

Me too, basically.

> If every invocation of troff and of grotty has to parse such a large
> file, the startup time will be in the range of several seconds,
> which is prohibitive.

Please always bear in mind that groff is actually a typesetting
program, not a `man' filter!  TTY output is handled similarly to other
devices like PS.  This has advantages, but also some disadvantages.
What you want to do probably needs a lot of `Extrawürschte' just for
TTY output, and this must be done very carefully since my main goal is
to have good support for Unicode for all output devices.

> But before discussing the implementation, I'd like to agree with you
> on the first intermediate goal: the output of 'troff'.
> 
>   1) Should the output of 'troff' contain Unicode characters or glyphs?
>      I.e. u0045 then u0302 then u0301, or u0045_0302_0301 as a single
>      entity?

troff's output always contains glyphs, similar to TeX, so you get the
single entity u0045_0302_0301.
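
To make the naming concrete, here is a minimal Python sketch of how a
composite glyph name such as u0045_0302_0301 relates to a sequence of
code points.  It is only an illustration, not groff's code; the helper
names glyph_name() and split_glyph_name() are hypothetical, and the
format assumed is `u' followed by uppercase hex values of at least four
digits joined with underscores.

```python
def glyph_name(code_points):
    """Build a composite glyph name from a sequence of code points.

    Assumed format: 'u' + hex values (uppercase, at least four digits)
    joined with '_', e.g. u0045_0302_0301.
    """
    if not code_points:
        raise ValueError("at least one code point is required")
    return "u" + "_".join(f"{cp:04X}" for cp in code_points)


def split_glyph_name(name):
    """Recover the list of code points from a composite glyph name."""
    if not name.startswith("u") or len(name) < 5:
        raise ValueError(f"not a Unicode glyph name: {name!r}")
    return [int(part, 16) for part in name[1:].split("_")]
```

For example, glyph_name([0x45, 0x302, 0x301]) gives the single entity
discussed above, and split_glyph_name() turns it back into its three
components.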

>   2) Do you agree that the case u0078_0302_0301 should be handled the
>      same way as u0045_0302_0301? (For one of them the precomposed
>      character is contained in Unicode, for the other it isn't.)

Yes.  Normalization form D will always be applied.
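
The effect of always applying normalization form D can be illustrated
with Python's standard unicodedata module.  This is a sketch of the
idea only, not how troff implements it; nfd_glyph_name() is a
hypothetical helper.  A precomposed character and the equivalent
combining sequence end up with the same name, and a base letter without
any precomposed form follows the same pattern.

```python
import unicodedata


def nfd_glyph_name(text):
    """Name the glyph for `text` after canonical decomposition (NFD).

    Illustration only: the name is 'u' plus the decomposed code points
    in uppercase hex, joined with '_'.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "u" + "_".join(f"{ord(ch):04X}" for ch in decomposed)


# U+1EBE is the precomposed E with circumflex and acute.
precomposed = nfd_glyph_name("\u1EBE")
# The same character written as a base letter plus two combining marks.
combining = nfd_glyph_name("E\u0302\u0301")
# x with circumflex and acute has no precomposed form in Unicode.
no_precomposed = nfd_glyph_name("x\u0302\u0301")
```

Under NFD the first two calls yield the same name, so it does not
matter whether Unicode happens to contain a precomposed character for
the sequence.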

> Maybe the answers to these two questions depends whether the next
> program in the pipe is 'grotty' or not? (I hope it's independent.)

The output is independent.

> Also, is src/roff/troff/input.cpp:composite_glyph_name() one of the
> functions involved here, or is my understanding of the code completely
> wrong?

composite_glyph_name() is used for the \[aaa bbb ccc] escape sequence.


    Werner



