
Re: [Groff] ASCII Minus Sign in man Pages


From: Ingo Schwarze
Subject: Re: [Groff] ASCII Minus Sign in man Pages
Date: Thu, 4 May 2017 05:27:32 +0200
User-agent: Mutt/1.6.2 (2016-07-01)

Hi James,

James K. Lowden wrote on Wed, May 03, 2017 at 08:13:18PM -0400:

> IIUC, this debate about how to render - and \- stems from a conflict in
> historical practice.  Is the following correct?  
> 
>       When troff was young, terminals were ascii and the - character
> was 0x2d.  Manpage guidelines encouraged the use of \- for flags because
> they rendered nicely in printed documents with no harm done to nroff
> output.  They did that despite the obvious fact that the manpage is
> there to describe what to type, and basically no one can type the
> denoted character.  
> 
>       Then Unicode pronounced that 0x2d was neither fish nor fowl,
> and gave us hyphen, minus, and endash characters.  groff dutifully
> mapped - onto hyphen and \- onto minus.  But when terminals gained Unicode
> capability, some of them lost cut-and-paste convenience.

So far, that is maybe somewhat simplified, but more or less to the
point.  For details of early runoff/roff history, see

  http://manpages.bsd.lv/history.html

Basically, you are starting your narrative in 1973.  At that point,
the language was about nine years old and had seen at least ten
earlier implementations in about eight different programming languages
on about five different operating systems on about eight different
machines by at least ten different authors.  Finding out how all
those handled "-" when they were young might be non-trivial.

> The debate is over how to recover that convenience.  

No, it is not.  That was solved long ago, at the latest here:

  commit 98acc924f4e32cfc2209df5db0c21921df8cc7ac
  Author: Werner LEMBERG <address@hidden>
  Date:   Fri Jan 2 23:16:20 2009 +0000

    * tmac/an-old.tmac, tmac/doc.tmac: For -Tutf8, map \-, -, ', and `
    conservatively to ASCII for the sake of easy cut and paste.

The debate is over three different topics:

 1. Cut and paste from -Tps, -Tpdf, and -Thtml.

 2. What to use if ASCII HYPHEN-MINUS is desired in the output,
    both in manual pages and in other documents.

 3. What to use if a mathematical minus sign is desired in the
    output.

> Oddly, my system doesn't exhibit any cut-and-paste anomaly despite
> using xterm with the "-en UTF-8" option.  Searching for - in less
> also works.

Yes, due to Werner's change in 2009 quoted above.
One of the effects is that in manual pages, "-" and "\-"
in the input always render as U+002D HYPHEN-MINUS in -Tutf8.
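
One way to check this on a given system is to count which dash-like
code points actually appear in the formatter's output.  The following
is only a minimal sketch and not part of groff: the function name
dash_report and the selection of code points are assumptions made
for illustration.

```python
import unicodedata

# Dash-like code points discussed in this thread.  The selection is an
# assumption for illustration, not an exhaustive list.
DASHES = ("\u002d", "\u2010", "\u2212", "\u2013")


def dash_report(text):
    """Return {Unicode character name: count} for dash-like characters in text."""
    counts = {}
    for ch in text:
        if ch in DASHES:
            name = unicodedata.name(ch)
            counts[name] = counts.get(name, 0) + 1
    return counts
```

Fed with text captured from a manual page formatted with -Tutf8, a
report that contains only HYPHEN-MINUS would be consistent with the
mapping described above; any HYPHEN or MINUS SIGN entries would point
to text that cannot be pasted into a shell as an option flag.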

> If it's a UI issue we're confronting, perhaps it's really up to the UI
> to deal with.  The man utility can certainly impose on nroff the
> requirement that - and \- both render as 0x2d.  Then it shows up
> correctly in the pager.  It is visually acceptable to the user, and
> DTRT regarding the UI.  (Maybe that's what Ubuntu LTS does for me; I
> don't know.) 

It's not Ubuntu, it's groff itself already doing that for you.

> It's not obvious to me groff should make any change at all.  At most,
> reverting the mapping of - so that it outputs 0x2d again would undo a
> nonobvious, subtle change in favor of simplicity.  

Probably not, because that would break each and every existing
non-manpage roff document.  Besides, I just noticed that it's
completely unclear what "output U+002D HYPHEN-MINUS in a PostScript
or PDF document" is even supposed to mean, see my other mail...

Yours,
  Ingo


