[Groff] Manpages, groff, and the browser.

From: Kristaps Dzonsons
Subject: [Groff] Manpages, groff, and the browser.
Date: Sun, 16 Mar 2014 23:48:48 +0100
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:24.0) Gecko/20100101 Thunderbird/24.3.0

Hi folks,

In the last few weeks, there's been some confusing mention of manpages on this list. Confusing because some of the issues raised, in my eyes, aren't really issues at all. So I thought I'd pipe up in the hope of adding some clarity.

To begin on familiar ground, by manpages I mean man(7), which simplifies manpages in the style of the original UNIX Programmer's Manual; and mdoc(7), written for a similar purpose but with hindsight of man(7)'s ambiguity, such as how to format variable names or structure the manpage header. mdoc(7) is necessarily more complicated than man(7), but it significantly relieves authors of stylistic improvisation. Then there's -Tascii for man(1) and -Tps for dead trees, etc.

(By the way, I was a very small boy when mdoc(7) was written and had nothing to do with it. Maybe somebody has a handle on one of the original authors and can corroborate its origins? We already have maybe the same but for macro packages?)

The confusion in these list threads, as I see it, begins when browsers are brought into the classical mix of man(7), mdoc(7), -Tps, and -Tascii. What's also confusing is semantics and the web in general.

Browsers are confusing because HTML doesn't play well with character-driven media. And roff(7), into which groff(1) translates man(7) and mdoc(7), is (significantly?) character-driven. We hack around this by converting -Tascii output into <pre>-wrapped documents. But that's not really HTML and makes browsers cry.
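For concreteness, here is a minimal Python sketch of that <pre> hack. It assumes the -Tascii output marks bold and underline with backspace overstrikes (the style that col -b strips) rather than escape sequences; the function name ascii_to_pre is made up for illustration.

```python
import html
import re


def ascii_to_pre(text: str) -> str:
    """Wrap formatted terminal output in a <pre> element.

    Assumes overstrike formatting: bold is "c BS c" and underline is
    "_ BS c". Dropping each (character, backspace) pair keeps only the
    final glyph, so all emphasis is lost.
    """
    plain = re.sub(r".\x08", "", text)
    # Escape &, <, > so the browser shows the text literally.
    return "<pre>" + html.escape(plain) + "</pre>"
```

The result displays in a browser, but nothing in it records that a word was a flag, an argument, or a section title: the page is a character grid with markup wrapped around it.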

One solution is to disregard roff(7) and regard only man(7) and mdoc(7). mandoc(1) does this. It gets away with it because it's built specifically (and in a way, dumbly) just for man(7) and mdoc(7) and just enough roff(7), tbl(7), etc. groff(1) is far broader in scope, and consumes roff(7) as a whole. So it can't exploit this simple trick.
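To illustrate the idea only (this is not how mandoc(1) is implemented, and the tags below are not what its -Thtml output looks like), here is a toy Python sketch that reads a handful of mdoc(7) macros directly and emits HTML without going through roff(7). It assumes one macro per line with plain-text arguments, which real mdoc(7) does not guarantee; the tag choices and the name mdoc_to_html are hypothetical.

```python
import html

# Hypothetical tag choices for a few real mdoc(7) macros.
INLINE = {"Nm": "b", "Fl": "b", "Ar": "i", "Va": "var"}


def mdoc_to_html(source: str) -> str:
    """Translate a tiny, simplified subset of mdoc(7) into HTML."""
    out = []
    for line in source.splitlines():
        if not line.startswith("."):
            # Ordinary text line: pass through, escaped.
            out.append(html.escape(line))
            continue
        macro, _, args = line[1:].partition(" ")
        text = html.escape(args)
        if macro == "Sh":
            out.append(f"<h1>{text}</h1>")
        elif macro in INLINE:
            if macro == "Fl":
                # Flags are written without their leading dash in mdoc(7).
                text = "-" + text
            tag = INLINE[macro]
            out.append(f'<{tag} class="{macro}">{text}</{tag}>')
        # Any other macro is silently dropped in this sketch.
    return "\n".join(out)
```

Because the macro name survives into the output (here as a class attribute), the HTML still says "this is a flag" or "this is an argument", which is exactly what gets lost when going through -Tascii.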

It was suggested that groff(1) be taught a subset of roff(7) that can map into a tree structure, then compile that further into HTML. If this is possible (it sounds hard and/or awesome), and if somebody pulls it off and modifies the existing macros to use the "clean" roff(7), then groff(1) would map beautifully into HTML and not care whether its input is mom(7), mdoc(7), or man(7) so long as the underlying tmac file has been properly treated. That's a lot of work: identifying the relevant roff(7) macros, then teaching groff(1) to extract a syntax tree from those macros, then doing something with that syntax tree, then modifying the macro packages. But it sounds, to my uninformed ear, possible.

Unfortunately, that's only half of the confusion. The other half is "semantics".

Even if groff(1) could do as above, and somehow carry over the original macro language's "meaning", it'd be only as good as its input language. To wit, Eric proposed extending man(7) with semantics to address exactly that. And that would give us... another mdoc(7).

While I agree that mdoc(7) is no semantic saint--sometimes it goes too far, sometimes not far enough--it exists right now, has considerable support and inertia, many eyes on macros and renderings, and has demonstrable proof of capability. mandocdb(8), via mandoc(3), dumps manpages' semantic content into Berkeley or SQLite databases. (Ingo, who's captaining mandoc, can speak better on its status, as well as -Thtml and friends.)

And how exactly would groff(1) profit from a new macro language? At the very least, it'd require a whole new macro package to maintain. And groff(1) still wouldn't be able to understand semantics without "clean" roff(7) and considerable work on internals.

And how would the community as a whole benefit? As a language, the new man(7) wouldn't be much different from mdoc(7). And then there's balkanisation: we already have two languages for manpages. You're proposing another?

If semantics and browsers are the future of manpages, then we already have real, working solutions. We have mdoc(7). And there's at least a credible plan for modifying groff(1) to support a clean roff(7) which could be used by both man(7) and mdoc(7). mandoc(1) can already do this: you can hook into mandoc(3) today and see for yourself.

So in short, why not throw more weight behind mdoc(7) instead of reinventing the wheel?

If groff(1) gains "clean" roff(7) capabilities, it could hook into mdoc(7) and man(7) as they live today. There's no need for yet another language--we already have one that works for many users. And if we find issues, we can collectively consider how to grow it with the knowledge of thousands of existing mdoc(7) pages, and the good folks in the BSD systems who work with them, and their -Thtml output, on a daily basis. Everybody wins.



