Re: [groff] hyphen, minus sign and hyphen-minus

From: Ingo Schwarze
Subject: Re: [groff] hyphen, minus sign and hyphen-minus
Date: Mon, 28 May 2018 02:48:09 +0200

Hi Pali,

Pali Rohar wrote on Sun, May 27, 2018 at 11:52:44PM +0200:

> Now I looked deeply at man -Tps output and basically \- sequence is
> written as character 0xAD (\255 in octal) into output postscript file.
> Therefore it is SOFT HYPHEN (U+00AD),

No, that is not a "soft hyphen".  Glyph numbers in fonts used for
PostScript output have nothing to do with Unicode code points.
Look at the file font/devps/TR for examples:

PS name      TR#   Unicode
-------      ---   -------
asciicircum  0x00  U+005E
asciitilde   0x01  U+007E
Scaron       0x02  U+0053 U+030C
Zcaron       0x03  U+005A U+030C
scaron       0x04  U+0073 U+030C
zcaron       0x05  U+007A U+030C
Ydieresis    0x06  U+0059 U+0308
trademark    0x07  U+2122
quotesingle  0x08  U+0027
Euro         0x09  U+20AC
hyphen       0x2d  U+2010
circumflex   0x5e  U+02C6
quoteleft    0x60  U+2018
tilde        0x7e  U+02DC
bullet       0x83  U+2022
florin       0x84  U+0192
minus        0xad  U+2212

And so on and so forth: the glyph numbers and the Unicode code points
differ almost everywhere.
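
To make the point concrete, here is a minimal Python sketch.  It is
not anything groff itself does; the names TR_EXCERPT, describe and
same_number_in_unicode are made up for illustration, and the
dictionary holds only a few rows copied from the table above.  It
looks up what a TR glyph number stands for and compares that with
what the same number would mean if it were read as a Unicode code
point.

```python
import unicodedata

# A few rows copied from the table above (illustrative excerpt only):
# TR glyph number -> (PostScript glyph name, Unicode code points).
TR_EXCERPT = {
    0x2D: ("hyphen", (0x2010,)),
    0x5E: ("circumflex", (0x02C6,)),
    0x7E: ("tilde", (0x02DC,)),
    0xAD: ("minus", (0x2212,)),
}


def describe(glyph_number):
    """Return the PS name and Unicode character names for a TR glyph number."""
    ps_name, code_points = TR_EXCERPT[glyph_number]
    unicode_names = [unicodedata.name(chr(cp)) for cp in code_points]
    return ps_name, unicode_names


def same_number_in_unicode(glyph_number):
    """Return the name of the Unicode character that has this number as its code point."""
    return unicodedata.name(chr(glyph_number))


if __name__ == "__main__":
    for number in sorted(TR_EXCERPT):
        ps_name, unicode_names = describe(number)
        print(f"TR 0x{number:02x}: {ps_name} = {', '.join(unicode_names)}; "
              f"U+{number:04X} would be {same_number_in_unicode(number)}")
```

For 0xad, the TR side gives "minus" (MINUS SIGN), while reading the
same number as a Unicode code point gives SOFT HYPHEN, which is the
confusion described above.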

> so it is incorrect for command line switch.

It is not incorrect.  The TR font does not contain a glyph for
hyphen-minus, so plain minus is used as a fallback.

> I looked deeply at DVI output and basically \- is printed in DVI as
> glyph with index 0 from font cmsy10. Looking at cmsy10.pfb Type1 font
> file and there is "dup 0 /minus put" which means that this character is
> mathematical minus with Unicode code point U+2212.

That could well be the same effect as for PostScript.  I don't have
the slightest idea how the DVI file format works (even though I used
LaTeX a lot in the past).  Is the wish "I want to print hyphen-minus"
reasonable for DVI?  It might be just as meaningless as for PostScript
with the default TR font.

> So the result is that "preferred" way for writing command line switches
> in manpage via \- sequence is broken for -Tps and -Tdvi.

Sure, for PostScript, that breakage is expected and can't be fixed.
I have no idea with respect to DVI.

> Is somebody going to fix PS

That's impossible.

> and DVI output to be "compatible" with "preferred" way how
> to write hyphen-minus in manpages?

I don't know about DVI.

HTML output could probably be fixed if somebody cared enough to track
down the root cause of the bug, though I'm not completely sure how that
works internally, so maybe I'm too optimistic.


