bug-groff

Re: nroff char '0255' bug?


From: Werner LEMBERG
Subject: Re: nroff char '0255' bug?
Date: Thu, 03 Jul 2003 18:05:28 +0200 (CEST)

> The byte-code 0xAD is octal 255, and in Latin1 encoding it
> represents a kind of hyphen character. A very easy way to create
> such a file is
> to catch the output of:       echo axb | tr 'x' '\255'
> A full demo is easy as well:  echo axb | tr 'x' '\255' | nroff | more
> or:                           echo axb | tr 'x' '\255' | nroff | od -c
> 
> The output of nroff (1.18) for this input-data shows the 'a' and the
> 'b', but nothing in between. Previous nroff-versions such as 1.16.1
> showed a proper hyphen between the 'a' and the 'b'.

From the README file:

  o Using the latin-1 input character 0xAD (soft hyphen) for the `shc'
    request was a bad idea.  Instead, it is now translated to `\%',
    and the default hyphenation character is again \[hy].  Note that
    the glyph \[shc] is not useful for typographic purposes; it only
    exists to have glyph names for all latin-1 characters.

So the new behaviour is the correct one (within the groff universe).
The main function of the soft hyphen in groff is to indicate a
possible hyphenation point -- for groff, 0xAD is a special character
by default.
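
If the README entry is read literally, an input byte 0xAD and the escape `\%` should be handled the same way. The following is only a sketch for comparing the two, assuming nroff reads its input as Latin-1; the helper names soft_hyphen_input and escape_input are made up for illustration and are not part of groff.

```shell
#!/bin/sh
# Hypothetical helpers: each one emits a single input line.

# 'a', the byte 0xAD (octal 255), 'b'
soft_hyphen_input() { printf 'a\255b\n'; }

# 'a', the groff escape \% (hyphenation point), 'b'
escape_input() { printf 'a\\%%b\n'; }

# Run the comparison only if nroff is installed.
if command -v nroff >/dev/null 2>&1; then
    soft_hyphen_input | nroff | od -c
    escape_input | nroff | od -c
fi
```

Comparing the two `od -c` dumps shows whether both inputs are formatted alike on a given installation.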

An excellent discussion on this topic can be found here:

  http://www.cs.tut.fi/~jkorpela/shy.html

It has also been recently discussed on the linux-utf8 mailing list.

The conclusion is that you should never use 0xAD...

To change groff's behaviour you can say, for example:

  .tr \[char173]\[char173]
  .trin \[char173]\[hy]

Then a hyphen is printed for all occurrences of 0xAD.
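
One way to try this without editing the document itself is to put the two requests in front of the test input from the original report. This is a sketch only: make_input is a made-up helper name, and it assumes nroff reads the input as Latin-1.

```shell
#!/bin/sh
# Hypothetical helper: emit the two translation requests from this
# message, followed by the test line 'a', 0xAD, 'b'.
make_input() {
    # The quoted heredoc delimiter keeps the backslashes literal.
    cat <<'EOF'
.tr \[char173]\[char173]
.trin \[char173]\[hy]
EOF
    # Octal 255 is the byte 0xAD.
    printf 'a\255b\n'
}

# Format the result only if nroff is installed.
if command -v nroff >/dev/null 2>&1; then
    make_input | nroff | od -c
fi
```

Piping through `od -c` rather than `more` makes it easier to see which character, if any, ends up between the 'a' and the 'b'.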


     Werner



