
Re: [Groff] Modifying hyphenation rules?


From: Ted Harding
Subject: Re: [Groff] Modifying hyphenation rules?
Date: Thu, 21 Dec 2006 09:45:14 -0000 (GMT)

On 21-Dec-06 Keith Marshall wrote:
> Werner Lemberg wrote:
>> > Using .hw works OK, but like I said, there are hundreds of these
>> > suckers. Is there any way to tell Groff that hyphenating at any
>> > '[a-z][A-Z]' junction is OK?
>>
>> No.  The easiest thing is probably to use sed or awk in a
>> preprocessing step to insert `\%' at the appropriate places
>> automatically.  Since uppercase almost always indicate a possible
>> breakpoint (you aren't writing Gaelic, aren't you? :-) this should be
>> rather straightforward.
> 
> Wow!  Werner's linguistic skills, and knowledge, continue to amaze!
> 
> Presumably here the reference is to the aspirated genitive form of a 
> capitalised noun, as in `Uiscebhealaí Intíre na hÉireann', (Inland
> Waterways of Ireland), where we clearly see a lower case letter
> immediately followed by a capital, where hyphenation would not be
> appropriate.  But, even in English, we see a similar construct; it
> appears in the anglicisation of Gaelic surnames, particularly Irish
> Gaelic surnames.  For example, I'm told that my own Gaelic surname
> would be `mac Donald', (Son of Donald), which is commonly anglicised
> to McDonald.

There is a fairly plausible case for permitting hyphenation as Mc-
Donald (perhaps stronger when written in the alternative form Mac-
Donald) since it splits the word at a "semantic boundary".
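
The preprocessing step Werner suggests could be sketched roughly as
follows. This is a Python stand-in for the sed or awk filter he mentions,
not anything built into groff; the name mark_case_junctions and the skip
parameter are hypothetical. It inserts `\%' between an ASCII lowercase
letter and an ASCII uppercase letter, leaves lines starting with `.' or
`'' alone on the assumption that they are requests or macro calls, and
ignores a letter directly preceded by a backslash so that escapes such
as \fB are not corrupted.

```python
import re

# ASCII lowercase followed by ASCII uppercase, where the lowercase letter
# is not the name of an escape such as \fB (hence the lookbehind).
_JUNCTION = re.compile(r"(?<!\\)([a-z])(?=[A-Z])")

def mark_case_junctions(text, skip=()):
    """Insert groff's \\% at lower/upper junctions (hypothetical helper)."""
    out = []
    for line in text.split("\n"):
        # Assumption: lines starting with . or ' are requests; leave them.
        if line.startswith((".", "'")):
            out.append(line)
            continue
        words = []
        for word in line.split(" "):
            if word in skip:
                # Caller-supplied exceptions are passed through unchanged.
                words.append(word)
            else:
                words.append(_JUNCTION.sub(lambda m: m.group(1) + "\\%", word))
        out.append(" ".join(words))
    return "\n".join(out)

# prints: call get\%Value then McDonald
print(mark_case_junctions("call getValue then McDonald", skip={"McDonald"}))
```

Because the character classes are ASCII-only, a word like "hÉireann" is
left untouched, but an unaccented lower/upper junction that should not be
split would still need an entry in skip. The skip test is an exact match
on space-separated words, so "McDonald," with trailing punctuation would
not be recognised; a real filter would need something more careful.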

However, there's another trap if you 'groff' Scottish Gaelic.

Consider the topographical name "Coire an t-Sneachda" (literally
"the corrie of the snow"). "Snow" on its own is "Sneachd", pronounced
roughly as "Snechg" with German-like "ch". But the spoken language
'does not like' the "nSn" sequence in "an Sneachda" so the "S" mutates
to a "T" (and consequently the "n" to an "r"-like sound), and the
standard orthography represents simultaneously the root "Sneachd"
of the word and the mutation to "T" by prepending "t-".

So, if "written as spoken", it might be "Coire an Treachda".

I give this example because the hyphenation is a "meta-spelling",
as explained, and does not in any way indicate a compound of the
hyphenated constituents. So splitting at the hyphen would
not make sense.

Now 'groff' the text

  Push this to the end of the line: Coire an t-Sneachda

in large enough point-size (I used 24-point on a 6-inch line
in Times Roman), and you will get

  Push this to the end of the line: Coire an t-
  Sneachda

because of the presence of the hyphen; and you probably would
not want this.

You could suppress it with something like

  Push this to the end of the line: Coire an t-\h'0'Sneachda

(exploiting groff's rule that words containing escape sequences
generally do not get hyphenated. I've never discovered the
reasons for that rule, by the way; it often causes problems,
and rarely seems to bring benefits).
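
With hundreds of such names, the same workaround could be applied
mechanically. The following is a minimal Python sketch under stated
assumptions: protect_mutation_hyphens and its prefixes parameter are
hypothetical names, only the "t-" prefix is taken from the example
above, and it relies on the \h'0' behaviour described here rather than
on anything verified across groff versions.

```python
import re

def protect_mutation_hyphens(text, prefixes=("t",)):
    """Append \\h'0' after hyphens such as the one in t-Sneachda.

    `prefixes` is an assumption: only the "t-" case comes from the
    example above; other mutation prefixes would have to be added.
    """
    alternatives = "|".join(re.escape(p) for p in prefixes)
    # A prefix standing alone before the hyphen, followed by an ASCII capital.
    pattern = re.compile(r"\b(" + alternatives + r")-(?=[A-Z])")
    return pattern.sub(lambda m: m.group(0) + "\\h'0'", text)

# prints: Coire an t-\h'0'Sneachda
print(protect_mutation_hyphens("Coire an t-Sneachda"))
```

The word-boundary test keeps ordinary compounds such as "part-Time" or
"Mac-Donald" out of it, since there the "t" or "c" is not a word on
its own.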

This isn't the only such example. There are many words which
begin with "Sr...", for instance, where the same can happen,
so the spelling can become "... t-Sr... ". And it's not just
S->T either!

So a groff hyphenation file for Scottish Gaelic could look
quite interesting ... Personally, I'd be tempted to turn
hyphenation off at least some of the time!

Best wishes, and Happy Festive Season to all!
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <address@hidden>
Fax-to-email: +44 (0)870 094 0861
Date: 21-Dec-06                                       Time: 09:43:48
------------------------------ XFMail ------------------------------



