Re: [Groff] ASCII dash in UTF-8 locale

From: Ingo Schwarze
Subject: Re: [Groff] ASCII dash in UTF-8 locale
Date: Fri, 23 Jan 2015 16:35:52 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

Hi Carsten,

Carsten Kunze wrote on Fri, Jan 23, 2015 at 03:59:41PM +0100:

> is there a way (except \N'45') to output an ASCII dash (0x2d)
> with nroff in a UTF-8 locale (i.e. without -T or with -Tutf8)?

From font/devascii/R, you see that in -Tascii mode, all of the
following glyph names map to ASCII 0x2d: - \- \(en \(hy \(mi

From src/libs/libgroff/glyphuni.cpp, you see that in -Tutf8 mode,
no glyph name maps to code point U+002D.  There is a comment, though:

 // `-' and `hy' denote a HYPHEN, usually a glyph with a smaller width than
 // the MINUS sign.  Users who are viewing broken man pages that assume
 // that `-' denotes a U+002D character can either fix the broken man pages
 // or apply the workaround described in the PROBLEMS file.

The PROBLEMS file says:

 * The UTF-8 output of grotty has strange characters for the minus, the
   hyphen, and the right quote.  Why?

 The used Unicode characters (U+2212 for the minus sign and U+2010 for
 the hyphen) are the correct ones, but many programs can't search them
 properly.  The same is true for the right quote (U+201D).  To map
 those characters back to the ASCII characters, insert the following
 code snippet into the `troffrc' configuration file:

 .if '\*[.T]'utf8' \{\
 .  char \- \N'45'
 .  char  - \N'45'
 .  char  ' \N'39'
 .\}
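
If changing the site-wide troffrc is not an option, the same requests
can presumably go at the top of an individual input file instead.
What follows is only a sketch and not taken from the groff sources:
it reuses the requests quoted above, drops the quote mapping, and
the "ls \-l" line is just a made-up test input.

 .\" Sketch only: per-document variant of the PROBLEMS workaround.
 .\" Assumption: requests are placed before any text in the file.
 .if '\*[.T]'utf8' \{\
 .  char \- \N'45'
 .  char  - \N'45'
 .\}
 .\" Hypothetical test line; its dash should now come out as 0x2d.
 ls \-l

Running that through nroff -Tutf8 and checking the bytes with a hex
dump tool should show whether the dash really is 0x2d.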

> I want to copy command lines from nroff output into an xterm,
> but the shell complains about the '-' which is not 0x2d.

That shouldn't happen in manual pages.  Both tmac/an-old.tmac
and tmac/doc.tmac contain:

 .\" For UTF-8, map some characters conservatively for the sake
 .\" of easy cut and paste.
 .if '\*[.T]'utf8' \{\
 .  rchar \- - ' `
 .  char \- \N'45'
 .  char  - \N'45'
 .  char  ' \N'39'
 .  char  ` \N'96'
 .\}

> E.g. mandoc(1) does output 0x2d for '\-' and even '-' which IMHO
> is better for manpages (to allow copying text into the shell).

That was originally designed for groff compatibility, and I don't
rue it; the way groff does it does indeed seem to make sense.