Re: [groff] hyphen, minus sign and hyphen-minus
Pali Rohár
Mon, 28 May 2018 15:16:53 +0200
NeoMutt/20170113 (1.7.2)
On Monday 28 May 2018 02:48:09 Ingo Schwarze wrote:
> Hi Pali,
>
> Pali Rohar wrote on Sun, May 27, 2018 at 11:52:44PM +0200:
>
> > Now I looked deeply at man -Tps output and basically \- sequence is
> > written as character 0xAD (\255 in octet) into output postscript file.
> > Therefore it is SOFT HYPHEN (U+00AD),
>
> No, that is not a "soft hyphen". Glyph numbers in fonts used for
> PostScript output have nothing to do with Unicode code points.
> Look at the file font/devps/TR for examples:
>
> PS name      TR#   Unicode
> -------      ---   -------
> asciicircum  0x00  U+005E
> asciitilde   0x01  U+007E
> Scaron       0x02  U+0053 U+030C
> Zcaron       0x03  U+005A U+030C
> scaron       0x04  U+0073 U+030C
> zcaron       0x05  U+007A U+030C
> Ydieresis    0x06  U+0059 U+0308
> trademark    0x07  U+2122
> quotesingle  0x08  U+0027
> Euro         0x09  U+20AC
> hyphen       0x2d  U+2010
> circumflex   0x5e  U+02C6
> quoteleft    0x60  U+2018
> tilde        0x7e  U+02DC
> bullet       0x83  U+2022
> florin       0x84  U+0192
> minus        0xad  U+2212
>
> and so on and so forth, it's completely different all over the place.
What I am saying is that I generated a PostScript file via man -Tps and
then looked into the generated file.
In that PostScript file, in the place where the command line switch
--something should be, there was F2(\255... or F2<ad... IIRC, \255 is the
glyph code written in octal and <ad> is the same code in hex. Octal 255
and 0xAD are both decimal 173, so both refer to the same glyph.
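The arithmetic is easy to double-check. A minimal Python sketch (the
variable names are only for illustration):

```python
# The same glyph code, written the two ways it appears in the PostScript file.
octal_code = int("255", 8)   # from the string escape \255
hex_code = int("ad", 16)     # from the hex string <ad>

print(octal_code, hex_code)
assert octal_code == hex_code == 173
```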
Now I see that the PostScript file also contains an encoding vector,
def /ENC0 [ ... ], and that position 173 holds the name /minus. According
to Adobe, the name /minus represents the Unicode code point U+2212.
So you are right, it is not a soft hyphen; I forgot to look at the
encoding vector in the resulting PostScript file.
That also answers my question of why the ps2pdf converter generates a PDF
file in which U+2212 code points are used for switches. ps2pdf did it
correctly by looking into the attached encoding vector /ENC0.
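For anyone who wants to repeat the lookup, here is a rough Python sketch of
reading the glyph name at a given position of an encoding vector. The
ps_fragment below is a made-up, shortened stand-in (only slots 45 and 173
are filled in), not the real output, and glyph_name_at is a hypothetical
helper, so treat it as an illustration only:

```python
import re

# Hypothetical stand-in for an encoding vector with 256 slots; everything
# except the two positions discussed here is left as /.notdef.
names = [".notdef"] * 256
names[45] = "hyphen"
names[173] = "minus"
ps_fragment = "/ENC0 [ " + " ".join("/" + n for n in names) + " ] def"

def glyph_name_at(ps_text, vector, code):
    """Return the glyph name stored at index `code` of the named vector."""
    match = re.search(r"/" + re.escape(vector) + r"\s*\[(.*?)\]", ps_text, re.S)
    if match is None:
        raise ValueError("encoding vector not found: " + vector)
    # Every entry is a PostScript name literal such as /minus.
    entries = re.findall(r"/([^\s/\[\]]+)", match.group(1))
    return entries[code]

print(glyph_name_at(ps_fragment, "ENC0", 0xAD))
```

With this stand-in, position 0xAD (173) gives minus and position 0x2D (45)
gives hyphen.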
So the problem is surely in grops, which generates the PS file with the
attached encoding vector. Unicode's hyphen-minus has code point U+002D,
and according to Adobe's glyphlist.txt, U+002D is assigned to the glyph
name /hyphen.
So man -Tps (or grops) can be fixed. It just needs to generate a correct
encoding vector and use the proper glyph name /hyphen for \- when
generating output from a manpage.
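To show why the glyph name matters for command line switches, here is a
small Python sketch. AGL_EXCERPT contains only the two glyphlist.txt
entries relevant here; the dictionary and the to_unicode helper are names
made up for this example:

```python
import unicodedata

# Two entries from Adobe's glyphlist.txt (glyph name -> Unicode code point).
AGL_EXCERPT = {"hyphen": 0x002D, "minus": 0x2212}

def to_unicode(glyph_name):
    """Map a glyph name to the character a PDF text extractor would emit."""
    return chr(AGL_EXCERPT[glyph_name])

for name in ("hyphen", "minus"):
    ch = to_unicode(name)
    print(f"/{name} -> U+{ord(ch):04X} {unicodedata.name(ch)}")

# A switch copied out of the PDF only works when /hyphen was used.
assert to_unicode("hyphen") * 2 + "something" == "--something"
assert to_unicode("minus") * 2 + "something" != "--something"
```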
> > so it is incorrect for command line switch.
>
> It is not incorrect. The TR font does not contain a glyph for
> hyphen-minus, so plain minus is used as a fallback.
The font/devps/TR file has this line in its "charset" section:
\- 564,286 0 173 minus
Shouldn't this number be 45 instead of 173?
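For reference, this is how I read the fields of that line, as a short
Python sketch. It assumes the layout described in groff_font(5) (name,
metrics, type, code, optional entity name) and a decimal code as in the
line above; parse_charset_line is just a helper name for this example:

```python
def parse_charset_line(line):
    """Split one charset line into its whitespace-separated fields."""
    name, metrics, glyph_type, code, *rest = line.split()
    return {
        "name": name,
        "metrics": [int(m) for m in metrics.split(",")],  # width, height, ...
        "type": int(glyph_type),
        "code": int(code),            # decimal in this particular line
        "entity": rest[0] if rest else None,
    }

entry = parse_charset_line(r"\- 564,286 0 173 minus")
print(entry["code"], hex(entry["code"]), entry["entity"])
```

So the fourth field is the code in question: 173 (0xad), with the glyph
name minus next to it.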
--
Pali Rohár
address@hidden
