Re: [Groff] .hw request and composite words in germanic languages


From: Werner LEMBERG
Subject: Re: [Groff] .hw request and composite words in germanic languages
Date: Tue, 12 Mar 2002 06:11:06 +0100 (CET)

> > I doubt this.  It is possible that the TeX patterns disable
> > hyphenation there, i.e., the pattern is only tillå-ta, but it is
> > not possible to insert a \discretionary command into hyphenation
> > patterns.  Maybe you have to say ti"llåta, similar to German's
> > active `"' character to get the proper behaviour.
> 
> To be honest, I might have misunderstood him. I haven't looked more
> into this after my mail. Before posing my question, I created a
> document with this word, then added words before "tillåta"
> until it reached the margin. Groff refused to hyphenate it, so
> (since this is a fairly common word -- meaning "to allow") it is
> most likely that he has added patterns disallowing erroneous
> hyphenation. He says in the changelog:
> 
> % 1991-11-01: Added another some 6200 compound words, all of which were
> %             incorrectly hyphenated by the old patterns.

Indeed, this is what I think also -- the added patterns handle
incorrect hyphenations by disabling them.
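
To make the "disabling" mechanism concrete, here is a minimal sketch of
Liang-style pattern matching, the scheme behind TeX's hyphenation
patterns.  The three patterns are invented for illustration and are not
taken from the real Swedish pattern file; the function names and the
left_min/right_min defaults are assumptions as well.  An odd value
between two letters permits a break, an even value forbids it, and the
highest value at each position wins.

```python
import re

def parse_pattern(pattern):
    """Split a pattern such as 'til2l' into its letters and inter-letter values."""
    letters = re.sub(r"\d", "", pattern)
    values = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            values[pos] = int(ch)   # value sits before letters[pos]
        else:
            pos += 1
    return letters, values

def hyphenate(word, patterns, left_min=2, right_min=2):
    """Return word with '-' at every position the patterns allow."""
    dotted = "." + word.lower() + "."      # dots mark the word boundaries
    scores = [0] * (len(dotted) + 1)       # scores[i]: value before dotted[i]
    for pattern in patterns:
        letters, values = parse_pattern(pattern)
        start = dotted.find(letters)
        while start != -1:
            for offset, value in enumerate(values):
                scores[start + offset] = max(scores[start + offset], value)
            start = dotted.find(letters, start + 1)
    pieces, last = [], 0
    for k in range(left_min, len(word) - right_min + 1):
        if scores[k + 1] % 2 == 1:         # odd value: break before word[k]
            pieces.append(word[last:k])
            last = k
    pieces.append(word[last:])
    return "-".join(pieces)

# Hypothetical patterns: a generic break between double l, and one after å.
generic = ["l1l", "å1t"]
print(hyphenate("tillåta", generic))                # til-lå-ta
# A hypothetical inhibiting pattern overrides the generic l1l break.
print(hyphenate("tillåta", generic + ["til2l"]))    # tillå-ta
```

With only the generic patterns the word gets the wrong break "til-låta";
the even value in the third pattern suppresses that position, which is
all a pattern can do, since it cannot restore the dropped third `l'.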

> Before the actual patterns, there is an interesting paragraph (see
> below).  I removed that before copying the thing to
> ../groff/1.18/tmac/hyphen.se, supposing that this wasn't really
> Swedish but Klingon (possibly uuencoded).

:-)

> % Set \catcode, \uccode, and \lccode for the Swedish letters.
> % This should be done for all letters, really.
> \catcode`^^c5=11 \catcode`^^c4=11
> \catcode`^^d6=11 \catcode`^^c9=11 \catcode`^^e5=11 \catcode`^^e4=11
> [...]

The \catcode command changes the `category' of a character in TeX.
Normally, only ASCII letters are in category 11 (letter); most
other codes are in category 12 (other).  It is possible that some
macro packages have changed this (e.g. to category 13, active
characters).  The above lines ensure that all characters needed for
Swedish are in the right category.  Setting \uccode and \lccode is
similar to groff's hcode request; only characters with a non-zero
hyphenation code are subject to hyphenation.  Additionally, uppercase
characters are converted to lowercase before applying hyphenation
patterns.
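
A rough model of the hyphenation-code idea, as a sketch only: the table
layout and the names default_hcodes, set_hcode, and pattern_key are
made up for illustration and do not correspond to groff or TeX
internals.  It also simplifies by rejecting a whole word as soon as one
character lacks a code.

```python
import string

def default_hcodes():
    """ASCII letters only: lowercase maps to itself, uppercase to lowercase."""
    table = {c: c for c in string.ascii_lowercase}
    table.update({c: c.lower() for c in string.ascii_uppercase})
    return table

def set_hcode(table, char, code):
    """Give char a hyphenation code (loosely analogous to an hcode request)."""
    table[char] = code

def pattern_key(word, table):
    """Map a word to the string matched against the patterns.

    Returns None if any character has no hyphenation code, i.e. the
    word is not a candidate for hyphenation in this simplified model.
    """
    codes = [table.get(ch) for ch in word]
    if any(code is None for code in codes):
        return None
    return "".join(codes)

hcodes = default_hcodes()
print(pattern_key("Tillåta", hcodes))   # None: å has no code yet

# Register the Swedish letter in both cases, both mapping to lowercase.
set_hcode(hcodes, "å", "å")
set_hcode(hcodes, "Å", "å")
print(pattern_key("Tillåta", hcodes))   # tillåta
```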

Note that the encoding of the hyphenation patterns in TeX must be
identical to the font output encoding, while in groff the patterns are
handled as input character codes.  In many cases this isn't relevant,
but sometimes it makes a difference (have a look into dehyphn.tex, for
example -- these patterns contain macros and can't be used by groff
without modification, i.e., the macros must be expanded).


    Werner
