## Re: [Groff] Spanish hyphenation

From: Ted Harding
Subject: Re: [Groff] Spanish hyphenation
Date: Thu, 07 Sep 2000 23:13:04 +0100 (BST)

On 07-Sep-00 Werner LEMBERG wrote:
>
> This will fail for all patterns which contain TeX characters defined
> as `^^xx' since gtroff will interpret this literally.  Note that
> gtroff's capabilities to scan hyphenation patterns are very limited.
> For example, in the German hyphenation patterns, macros are used in
> the \patterns command!  gtroff can't handle this at all.  Everything
> which doesn't start with a `%' sign will be treated as a valid pattern
> (even including `\patterns{', and `}').
>
> To make TeX hyphenation patterns work with gtroff, replace all `^^xx'
> with real 8bit characters, resolve all macros in the \patterns group,
> and remove everything but the patterns themselves (and comments).  Of
> course, don't forget to properly set up `.hcode' (in gtroff).

Thanks, Werner, that throws a good bit of light on it.

So, for instance, where I see in sphyph.tex

\catcode`\^^e1=11\uccode`\^^e1=`\^^c1\lccode`\^^e1=`\^^e1

it seems that in the line

\patterns{
....................
^^e11d ^^e12d. ^^e11f ^^e12f. ^^e11g ^^e12g. ^^e11h ^^e12h.

I should replace ^^e1 with the ISO-Latin1 character e1=225 (á), etc.,
according to "lccode" (the "Lower Case code"?), and similarly further
down the patterns replace ^^c1 with the ISO-Latin1 character c1=193 (Á),
and so on for the other cases in the "catcode" lines? Then strip out
everything except what is between the {...} in \patterns{...}?
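
A minimal sketch of that conversion, as I understand it, might look like
the following. It assumes that every `^^xx' in the file uses two
lowercase hex digits, that no macros or nested braces remain inside
\patterns{...}, and that the output should be written as ISO-Latin1
bytes. The function names are made up for illustration and are not part
of groff or TeX.

```python
import re


def decode_carets(text):
    # Replace each TeX ^^xx escape (two lowercase hex digits) with the
    # single character that has that code, e.g. ^^e1 -> á (225).
    return re.sub(r"\^\^([0-9a-f]{2})",
                  lambda m: chr(int(m.group(1), 16)),
                  text)


def extract_patterns(tex_source):
    # Return only the text between "\patterns{" and the first "}".
    # Assumption: the group contains no nested braces or macros.
    m = re.search(r"\\patterns\s*\{(.*?)\}", tex_source, re.DOTALL)
    if m is None:
        raise ValueError("no \\patterns{...} group found")
    return m.group(1)


def convert(tex_source):
    # Patterns only, ^^xx resolved, encoded as 8-bit ISO-Latin1.
    body = decode_carets(extract_patterns(tex_source))
    return body.strip().encode("latin-1") + b"\n"
```

With this, the pattern `^^e11d' comes out as the byte 0xe1 followed by
`1d', since only the two hex digits right after `^^' are consumed.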

And in the case of sphyph.tex (or eshyph.tex, whichever it may be),
is that it? I'm not seeing anything except "catcode" lines,
and lines inside \patterns{...}.

Regarding ".hcode":

.hcode c1 code1 c2 code2 ...
Set the hyphenation code of character c1 to code1
and that of c2 to code2.  A hyphenation code must
be a single input character (not a special character)
other than a digit or a space.  Initially each
lower-case letter has a hyphenation code, which is
itself, and each upper-case letter has a hyphenation
code which is the lower case version of

Might this mean that letters with ISO-Latin1 codes > 128 are already
initialised as above for .hcode, or only the letters [a-zA-Z]?
If the latter, should it then be

.hcode á á Á á ...

According to the code in groff, it seems that this depends on the
behaviour of the C functions isascii and isalpha which may be
locale-dependent; experiment with these functions on my system says:

a       ASCII = 1       ALPHA = 1
A       ASCII = 1       ALPHA = 1
á       ASCII = 0       ALPHA = 0
Á       ASCII = 0       ALPHA = 0

so it definitely looks like the latter ([a-zA-Z] only).
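
If that is right, the `.hcode' line for the accented letters could be
generated rather than typed by hand. The sketch below is only an
illustration: the helper name is hypothetical, the choice of letters
passed to it is my own guess at what Spanish needs, and the resulting
line would have to be saved as ISO-Latin1 for gtroff to read it as
single 8-bit characters.

```python
def hcode_request(lower_letters):
    # Build one ".hcode c1 code1 c2 code2 ..." request that maps each
    # lowercase letter to itself and its uppercase form to the
    # lowercase letter.
    pairs = []
    for lc in lower_letters:
        pairs.append(f"{lc} {lc}")
        uc = lc.upper()
        # Skip letters with no distinct single-character uppercase form.
        if uc != lc and len(uc) == 1:
            pairs.append(f"{uc} {lc}")
    return ".hcode " + " ".join(pairs)


if __name__ == "__main__":
    # Assumed letter set for Spanish; adjust to match the catcode lines.
    print(hcode_request("áéíóúñü"))
```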

Then is it necessary to bother about the cases in \patterns{...}
where there are uppercase characters, since hcode seems to map UC down
to LC for hyphenation?

Thanks again,
Ted.

--------------------------------------------------------------------