Re: [Groff] german localisation


From: joerg van den hoff
Subject: Re: [Groff] german localisation
Date: Tue, 25 Nov 2003 17:42:01 +0100
User-agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.5) Gecko/20031007

Werner LEMBERG wrote:



question: is there a canonical way to transform the original tex hyphenation pattern files to groff compatible versions without knowing how to do macro expansion and the like in the tex files?

No.  Most TeX pattern files can now be used as-is, but the German
patterns are an exception.

is there an awk or perl script around to do so? or are groff
compatible hyphenation pattern files (except for us english)
somewhere around (I didn't find any)?

It's quite easy, just use the following sed expressions within the
\patterns group:

 s/\n{\(.*\)}/\1/
 s/\c{.*}//
 s/"a/ä/
 s/"o/ö/
 s/"u/ü/
 s/\3/ß/

(actually, you can use it for the whole file, but it looks a bit
strange then).


   Werner
thank you very much. apart from the umlaut substitution, the script is meant to extract the pattern from every blank-separated construct that starts literally with "\n" and encloses the pattern in {}, and to remove the "\c" constructs, right? if so, I think the script should read

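sed -e '
# remove \c{...} first, so that a nested \n{\c{...}} collapses to \n{}
s/\\c{[a-zA-Z0-9".\]*}//g
# unwrap \n{...}, keeping only the pattern inside the braces
s/\\n{\([a-zA-Z0-9".\]*\)}/\1/g
s/"a/ä/g
s/"o/ö/g
s/"u/ü/g
s/\\3/ß/g
'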


1. the substitution needs to be global if the input has more than one pattern per line (as is the case for the 'dehyph[nt].tex' I found on some server).
2. the backslashes have to be masked (escaped) in front of [nc3].
3. ".*" does not work everywhere to identify the pattern. I don't remember what exactly is hit by "." but a "\" seems not to be included, at least. and I think a blank is a hit, therefore two successive patterns are not handled as two separate entities. anyway some patterns 'slip through'. I replaced the "." by an explicit list of characters possible in the pattern (is this complete?).
4. the first 2 lines need to be reversed because there are nested \n{\c{...}} constructs which are otherwise missed.
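
to illustrate points 1 and 3, here is a small sketch on a made-up input line (the sample is an assumption, not copied from dehyphn.tex, and the function names `greedy` and `explicit` are only for illustration). one likely reason ".*" misbehaves is that it is greedy: with two constructs on one line it runs from the first "{" up to the last "}".

```shell
#!/bin/sh
# Made-up sample line with two \n{...} constructs (not copied from dehyphn.tex).
line='\n{.ab1a} \n{.ab3l}'

# The ".*" form with the backslash escaped: greedy match, no "g" flag.
greedy() {
    sed -e 's/\\n{\(.*\)}/\1/'
}

# The explicit character class plus the "g" flag.
explicit() {
    sed -e 's/\\n{\([a-zA-Z0-9".\]*\)}/\1/g'
}

printf '%s\n' "$line" | greedy     # prints: .ab1a} \n{.ab3l
printf '%s\n' "$line" | explicit   # prints: .ab1a .ab3l
```

the explicit class cannot cross a "}" or a blank, so each construct is matched on its own.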

maybe there is a simpler solution, but this version of the script seems to do the job.
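
for anyone who wants to try it, a minimal sketch of the same substitutions wrapped in a shell function and run on one made-up line. the function name `convert_patterns` and the sample input are assumptions for illustration only; a real run would read the \patterns group of the TeX file on standard input.

```shell
#!/bin/sh
# Hypothetical wrapper around the substitutions from this message.
# Reads TeX pattern text on stdin, writes the converted text to stdout.
convert_patterns() {
    sed -e '
s/\\c{[a-zA-Z0-9".\]*}//g
s/\\n{\([a-zA-Z0-9".\]*\)}/\1/g
s/"a/ä/g
s/"o/ö/g
s/"u/ü/g
s/\\3/ß/g
'
}

# Made-up sample: a plain \n{...}, a nested \n{\c{...}}, an umlaut and a \3.
printf '%s\n' '\n{.ab1a} \n{\c{abc}}h"o1r \3e' | convert_patterns
# prints: .ab1a hö1r ße
```

the nested \n{\c{abc}} first loses its \c{abc} part, and the remaining empty \n{} is then removed by the second substitution.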
thanks again.

joerg

