Re: [Groff] -mandoc alternative


From: Kristaps Džonsons
Subject: Re: [Groff] -mandoc alternative
Date: Thu, 16 Jul 2009 14:32:43 +0200
User-agent: Mozilla-Thunderbird 2.0.0.19 (X11/20090103)

Dear Werner et al.,

>> This mail is just to raise awareness of an alternative to "groff
>> -mandoc", mandoc (née mdocml) at <http://mdocml.bsd.lv>.
>> Disclaimer: I'm the project's lead.  mandoc is a BSD-licensed, C
>> implementation satisfying ONLY the BSD "-mdoc" manual format, and to
>> a limited extent, the traditional -man.  The system, as-is, handles
>> the majority of BSD manuals, and does so considerably faster than
>> groff.
> 
> If you restrict yourself to mdoc and man syntax this is a logical
> consequence...

I do, and this constitutes the basis of my arguments.  Note that some of
my arguments, especially that of groff's uncertainty, are not specific
to -mdoc/-man formatting.

> However, you write
> 
>   [groff] runs slowly, produces uncertain output, and varies in
>   operation from system to system.
> 
> Hmmm.  Please give more details how you come to this conclusion.

In terms of speed, groff loads tmac files (macros, character sets,
hyphenation patterns, etc.), reads and parses input into an intermediate
format (IF) by way of macro prototypes (assuming -mdoc/-man), sends the
IF to the output device, then renders the output.  Each of these steps
incurs significant overhead.

mandoc, by contrast, is a standalone executable: parser libraries
(-mdoc, -man) linked statically to output libraries (-Tascii, etc.).
The parsers are ad hoc and table-driven, governed by an ontology based
on macro syntax.  The "IF" is a well-formed, regular AST.  Character
sets and so on are hard-coded.

This is equivalent to hard-coding the tmac structures and linking
together all groff components into a single binary.  Obviously, mandoc
can only do this as a result of its specificity.
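
To make the table-driven idea concrete, here is a minimal sketch in
Python (mandoc itself is C).  The macro names are real mdoc macros, but
the MACROS table, the Node class, and parse() are hypothetical names
made up for this illustration; they are not mandoc's data structures,
and the sketch ignores nesting, callable macros, and escapes.

```python
from dataclasses import dataclass, field

# Hard-coded macro table (assumption: a tiny subset, for illustration).
# Nothing is loaded at run time, which is the point of the comparison.
MACROS = {
    "Sh": "section header",
    "Pp": "paragraph break",
    "Qq": "double-quoted text",
    "Vt": "variable type",
    "Pa": "path name",
}


@dataclass
class Node:
    kind: str                      # "root", "macro", or "text"
    name: str = ""                 # macro name, for "macro" nodes
    text: str = ""                 # literal text, for "text" nodes
    children: list = field(default_factory=list)


def parse(source):
    """Turn mdoc-like input into a flat, regular tree of nodes."""
    root = Node("root")
    for line in source.splitlines():
        if line.startswith("."):
            # Macro line: look the name up in the static table.
            name, _, rest = line[1:].strip().partition(" ")
            if name not in MACROS:
                raise ValueError(f"unknown macro: {name}")
            node = Node("macro", name=name)
            if rest:
                node.children.append(Node("text", text=rest))
            root.children.append(node)
        elif line.strip():
            # Free-form text line.
            root.children.append(Node("text", text=line))
    return root
```

Every node in the result is a root, a macro, or text, so an output
back-end can walk the tree without knowing anything about roff.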

By way of informal illustration, on a 2.5 GHz machine, `nroff -mandoc'
takes 1m46s (3-pass mean) to render all OpenBSD manuals (to /dev/null);
mandoc takes 4s, roughly a 26-fold difference.

In terms of uncertain output, mdoc(7) and mdoc.samples(7) -- not even to
mention the melange of troff(1), groff(1), groff_char(7), etc. -- make
for an irregular, fragmented reference.  Consider:

   .Qq Hello, world.
   Hello again.
   Hello yet again.

Notice the varied sentential spacing.  The same goes for discarding
whitespace (yes on some macro lines, no in free-form text).  Consider
also: if, say, `Qq' is used at the line border versus `"', will it
hyphenate?  Or any macro?  How are line overruns handled in this case?
Do the \*(xx escapes, or \(xx or \[xxx] or \*[xxx], produce equivalent
output?  What if one passes a title, volume, and architecture in `Dt'?
What happens to text on a `Pp' line?

All of these have answers, but the lack of a single, coherent reference
causes uncertainty.  The manuals bundled with mandoc are re-writes (or
still being re-written) of the above, with a specific eye toward
compositional regularity.

In terms of variegated output, OpenBSD and Linux, for example, render
`Nd' macros with an En dash (the former, until recently, was just an
escaped minus sign), while NetBSD uses an Em dash.  The set of available
macros is non-uniform (Lk?  Mt?).  The available special character set
differs.  The set of installed manuals differs.  Some systems render
`Pa' with an underline; some don't.  Macro default widths vary widely
(see `Er').

groff benefits from generality, where -mdoc and -man are only macro
prototypes, and from the liberties of customisation.  mandoc offers
neither of these (well, limited liberty) and is thus able to operate
much more efficiently and concisely within its specific domain.

This all disregards my biggest problem with groff with respect to -mdoc:
given -Thtml, how can I embolden only variable types?  Since the groff
IF abstracts presentation, and necessarily so, this is impossible in the
general case.  mandoc interprets -mdoc as a specific, semantic language,
reflected in the parsers' ASTs.
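
A rough sketch of what the semantic tree buys you, again in Python and
again hypothetical: STYLE and render() are invented names, and the
(macro, text) pairs stand in for AST nodes.  Because each node still
knows which macro produced it, styling only `Vt' (variable type) is a
table lookup, whereas a stream of font changes no longer records which
macro asked for them.

```python
import html

# Assumed per-macro styling: only `Vt' gets an HTML tag here.
STYLE = {"Vt": "b"}


def render(nodes):
    """Render (macro, text) pairs to HTML; macro is None for plain text."""
    out = []
    for macro, text in nodes:
        body = html.escape(text)
        tag = STYLE.get(macro)
        # Wrap the text only if this macro has a style entry.
        out.append(f"<{tag}>{body}</{tag}>" if tag else body)
    return " ".join(out)


print(render([("Vt", "struct stat"), ("Va", "sb"), (None, "is filled in.")]))
```

Changing which macros are emboldened means editing STYLE, not the
parser or the input.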

Hope this clears things up,

Kristaps



