Re: [Groff] address@hidden: Man_db UTF-8 issues]
From: Werner LEMBERG
Subject: Re: [Groff] address@hidden: Man_db UTF-8 issues]
Date: Fri, 04 Jan 2002 23:28:17 +0100 (CET)

> The problem lies in the character minus or dash or whatever you call
> it. In ASCII, you always use the same character, namely - (ascii
> 45). This is called a HYPHEN-MINUS in Unicode because it cannot be
> decided whether it is a HYPHEN or a MINUS. This character is used
> e.g. in mail addresses (...contact address@hidden) and
> options (--verbose).
>
> However, nroff renders an input character `-' as [...] a HYPHEN. It
> also renders `\-' as [...] a MINUS. So far, so good, nroff is a
> document formatting system.
>
> Unfortunately this means that [...] Those renderings look quite
> right, but aren't. You can't copy and paste them, for example.

I believe that this is a problem of the cut-and-paste mechanism in
general. Other programs will be equally affected. A special mode
would be necessary which maps Unicode to ASCII.

> Whenever the generated formatted manpage is not immediately destined
> for any kind of print output (like dvi or ps), it would probably be
> best to generate HYPHEN-MINUS for both `-' and `\-'. This could be
> an nroff change. It could also be another filter just behind nroff
> which transforms MINUS and HYPHEN back to HYPHEN-MINUS. (Note that
> the aesthetically pleasing short hyphens at the hyphenation points
> nroff inserts are not affected as those are SOFT HYPHENs.)

You can easily customize the output of `-' and `\-'. Add

  .tr -\N'45'
  .tr \-\N'45'

to the files `man.local' and `mdoc.local'. The use of \N is necessary
since (intentionally) U+002D is not assigned an entity name.
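
For comparison, the filter alternative mentioned in the quoted text
(a separate step behind nroff that maps HYPHEN and MINUS back to
HYPHEN-MINUS) could look roughly like the following. This is only a
minimal sketch, not part of groff or man_db. It assumes the formatted
output is UTF-8 text in which HYPHEN is U+2010 and MINUS is U+2212;
the names `to_hyphen_minus' and `filter_stream' are made up for
illustration.

```python
# Hypothetical post-nroff filter: map HYPHEN and MINUS SIGN back to
# HYPHEN-MINUS so that options such as --verbose can be copied and pasted.
# Assumption: the input is already decoded text (e.g. UTF-8 nroff output).

# U+2010 HYPHEN and U+2212 MINUS SIGN both become U+002D HYPHEN-MINUS.
TO_HYPHEN_MINUS = str.maketrans({"\u2010": "-", "\u2212": "-"})


def to_hyphen_minus(text):
    """Return text with HYPHEN and MINUS SIGN replaced by HYPHEN-MINUS."""
    # Every other character, including U+00AD SOFT HYPHEN and the
    # backspaces nroff uses for overstriking, is passed through unchanged.
    return text.translate(TO_HYPHEN_MINUS)


def filter_stream(src, dst):
    """Copy text lines from src to dst, translating each line."""
    for line in src:
        dst.write(to_hyphen_minus(line))
```

To use this as a real filter one would call
filter_stream(sys.stdin, sys.stdout) at the end of the script, in a
UTF-8 locale. The `.tr' approach above has the advantage that nothing
needs to run after nroff.
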
Werner