Re: [Groff] ASCII dash in UTF-8 locale

From: Tadziu Hoffmann
Subject: Re: [Groff] ASCII dash in UTF-8 locale
Date: Sat, 24 Jan 2015 01:41:08 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

> > Heirloom troff and groff both render \- as en dash,
> > not minus sign, in PDF output.

> If you use groff's native pdf driver (-Tpdf) I believe
> minus is rendered, can be searched for and copy/pasted.
> The postscript driver also outputs a "minus" so I suspect
> it is the ghostscript conversion to pdf which is changing it.

Here on my system, ghostscript keeps the minus when converting
to PDF.  The input file

  .sp 3c
  minus: \-
  en-dash: \(en

when processed by groff (using the default -Tps) and converted
to PDF using ghostscript results in the following page content
in the PDF:

  10 0 0 10 0 0 cm BT
  /R7 10 Tf
  1 0 0 1 72 744.851 Tm
  (minus: <AD>)Tj
  12 TL
  (en-dash: \211)'

where <AD> stands for the single byte 0xAD (decimal 173), matching
groff's "text.enc", which says minus is to be encoded at position
173.  Likewise, \211 is the octal escape for byte 137, the en-dash.
The font "R7" is a Times-Roman subset with the encoding

  /BaseEncoding /WinAnsiEncoding
  /Differences [ 137 /endash 173 /minus ]
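
To make the byte-to-glyph mapping concrete, here is a minimal Python
sketch.  It only models the two /Differences entries quoted above and
treats every other byte as plain ASCII, which is a simplification of
WinAnsiEncoding; the names DIFFERENCES and glyph_names are made up for
illustration and are not part of groff or ghostscript.

```python
# Codes copied from the /Differences array quoted above.
# Everything else here (names, structure) is illustrative only.
DIFFERENCES = {137: "endash", 173: "minus"}


def glyph_names(pdf_string: bytes) -> list[str]:
    """Map each byte of a PDF string to a glyph name.

    Bytes listed in DIFFERENCES get the overriding glyph name; all
    other bytes are shown as their ASCII character (a simplification).
    """
    return [DIFFERENCES.get(b, chr(b)) for b in pdf_string]


minus_line = b"minus: \xad"     # <AD> in the content stream, decimal 173
endash_line = b"en-dash: \x89"  # \211 octal == 137 decimal == 0x89

print(glyph_names(minus_line)[-1])   # minus
print(glyph_names(endash_line)[-1])  # endash
```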

Acroread (version 9) clearly renders both the minus and the en-dash.

When copied and pasted in a UTF-8 locale, it delivers them
as <e28892> and <e28093>, i.e., 'MINUS SIGN' (U+2212) and
'EN DASH' (U+2013).

In an ISO8859-1 locale both (like the hyphen) are pasted
as <2d>, i.e., "hyphen-minus".
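
The byte sequences above can be checked with a few lines of Python.
This is only a sketch of the observed result: the fallback to
hyphen-minus when the locale's character set lacks the character is an
assumption used to model what was pasted, not a description of how
Acroread is implemented, and paste_bytes is a hypothetical helper.

```python
MINUS_SIGN = "\u2212"  # 'MINUS SIGN'
EN_DASH = "\u2013"     # 'EN DASH'


def paste_bytes(ch: str, codec: str) -> bytes:
    """Encode ch in the given codec, modelling the pasted bytes.

    Assumption: if the codec cannot represent the character, the
    viewer substitutes an ASCII hyphen-minus (0x2d).
    """
    try:
        return ch.encode(codec)
    except UnicodeEncodeError:
        return b"-"


for name, ch in (("minus", MINUS_SIGN), ("en-dash", EN_DASH)):
    print(name,
          paste_bytes(ch, "utf-8").hex(),
          paste_bytes(ch, "iso8859-1").hex())
# minus e28892 2d
# en-dash e28093 2d
```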

If you want cut-and-pasteable ASCII command lines in PDF files,
I think the easiest way is to set up a hacked "code" font with
renamed glyphs.  Alternatively, you can try adding a
GlyphNames2Unicode dictionary.
