From: Werner LEMBERG
Subject: Re: [Groff] handling of composing and combined Unicode characters
Date: Tue, 10 Jan 2006 13:56:41 +0100 (CET)
> > Either you register `u0045_0302_0301' with .char directly in your
> > document (or in a proper macro file, say, `vi.tmac'), or you add
> > this to the devutf8 font description files.
>
> I want to get rid of the description files for everything that is
> related to Unicode characters and can be derived algorithmically.
Me too, basically.
> If every invocation of troff and of grotty has to parse such a large
> file, the startup time will be in the range of several seconds,
> which is prohibitive.
Please always bear in mind that groff is actually a typesetting
program, not a `man' filter! TTY output is handled similarly to other
devices like PS. This has advantages, but also some disadvantages.
What you want to do probably needs a lot of `Extrawürschte' (special
treatment) just for TTY output, and this must be done very carefully
since my main goal is to have good support for Unicode for all output
devices.
> But before discussing the implementation, I'd like to agree with you
> on the first intermediate goal: the output of 'troff'.
>
> 1) Should the output of 'troff' contain Unicode characters or glyphs?
> I.e. u0045 then u0302 then u0301, or u0045_0302_0301 as a single
> entity?
groff's output always contains glyphs, similar to TeX, so you have
u0045_0302_0301 as a single entity.
> 2) Do you agree that the case u0078_0302_0301 should be handled the
> same way as u0045_0302_0301? (For one of them the precomposed
> character is contained in Unicode, for the other it isn't.)
Yes. Normalization form D will always be applied.
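To illustrate why both cases come out the same, here is a minimal
sketch in Python (not groff's actual code) that derives such a glyph
name from normalization form D. The function name `nfd_glyph_name' is
hypothetical, and the sketch assumes the input is a single base
character plus its combining marks:

```python
import unicodedata


def nfd_glyph_name(cluster: str) -> str:
    """Build a uXXXX_YYYY_... style name from the NFD form of `cluster`.

    Hypothetical helper for illustration only; it assumes `cluster` is
    one base character followed by zero or more combining characters.
    """
    # Normalization form D fully decomposes precomposed characters.
    decomposed = unicodedata.normalize("NFD", cluster)
    # Each code point becomes at least four uppercase hex digits.
    return "u" + "_".join(f"{ord(ch):04X}" for ch in decomposed)


# Precomposed U+1EBE (E with circumflex and acute) decomposes fully.
print(nfd_glyph_name("\u1ebe"))
# No precomposed form exists for x with circumflex and acute.
print(nfd_glyph_name("x\u0302\u0301"))
```

The first call yields u0045_0302_0301 and the second
u0078_0302_0301, so whether Unicode happens to contain a precomposed
character makes no difference to the resulting name.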
> Maybe the answers to these two questions depends whether the next
> program in the pipe is 'grotty' or not? (I hope it's independent.)
The output is independent.
> Also, is src/roff/troff/input.cpp:composite_glyph_name() one of the
> functions involved here, or is my understanding of the code completely
> wrong?
composite_glyph_name() is used for the \[aaa bbb ccc] escape sequence.
Werner