Re: [Groff] The future redux

From: Eric S. Raymond
Subject: Re: [Groff] The future redux
Date: Tue, 25 Feb 2014 11:06:09 -0500
User-agent: Mutt/1.5.21 (2010-09-15)

Peter Schaffter <address@hidden>:
> I'm afraid this will be a long post.  Sorry, but I don't see any way
> around it.

I found this a very worthwhile read.  You raised deep issues that required
thought and development.  In this reply I will offer some responses that
I hope are as substantive.

Some weeks ago I explained that I think exceptional coding skills are not
necessarily required in a lead for a project like groff.  I said: "You
can lead a dev team by having good judgment and good taste and good
timing about what should be done, and showing that through good
communications skills."

I think you are demonstrating those qualities - and that, whether you
so intended it or not, this post was a strong bid for you to be
groff's next design lead.

> * Backward Compatibility

I gave your essay a first read, then went off to make breakfast - an
important beginning-of-day ritual for me during which I get my start 
on thinking about the day's problems.  While I was eating my eggs 
and bacon I had a large insight...

Backward compatibility is a red herring.  More precisely, it is not
the presence of presentation-level requests from the year zero that
makes groff-as-it-is unfit to play in the semantic-markup world, it is
the fact that macro packages presently *cannot disable access to the
lower level*.

Man markup is good to think about in this context because (a) it is
groff's most important application by document volume, and (b) it is
one where fine details of what you call "expressive" typography have
zero or vanishingly small importance.  This is so simply because 
of the typographic poverty of the viewer-contexts through which
man pages are normally rendered these days.

Right now, man markup is not really a controlled vocabulary.  Man-page
authors can, and sometimes do, write low-level requests that break
man-markup's semantic model.  This complicates life tremendously for
non-groff renderers and translators, including XMan and man2html and

Now let us imagine adding two primitives to groff:

1. Declare hygienic.  Takes a request or macro name, sets a 'hygienic'
bit on it.

2. Enable hygienic mode.  After this point, all explicit requests without
their hygienic bit set are disabled and cause a fatal error.  They
can only be used within hygienic macro expansions.
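
As a rough illustration, the two primitives above can be modelled in a
few lines of Python.  This is a toy sketch, not groff code: the class
ToyFormatter, its method names, and the HygieneError exception are all
hypothetical, and a real implementation would live inside troff's
request dispatcher.

```python
class HygieneError(Exception):
    """Fatal error: a non-hygienic request was used explicitly in hygienic mode."""


class ToyFormatter:
    """Toy model of the proposal; every name here is hypothetical."""

    def __init__(self):
        self.macros = {}            # macro name -> list of (name, args) body calls
        self.hygienic = set()       # names whose 'hygienic' bit is set
        self.hygienic_mode = False  # primitive 2 flips this on
        self.executed = []          # low-level requests that actually ran

    def define_macro(self, name, body):
        self.macros[name] = list(body)

    def declare_hygienic(self, name):
        # Primitive 1: set the hygienic bit on a request or macro name.
        self.hygienic.add(name)

    def enable_hygienic_mode(self):
        # Primitive 2: from now on, explicit non-hygienic requests are fatal.
        self.hygienic_mode = True

    def call(self, name, args=(), _inside_hygienic=False):
        allowed = (
            not self.hygienic_mode
            or name in self.hygienic
            or _inside_hygienic      # reached via a hygienic macro expansion
        )
        if not allowed:
            raise HygieneError(f"request '{name}' is disabled in hygienic mode")
        if name in self.macros:
            inner = _inside_hygienic or name in self.hygienic
            for inner_name, inner_args in self.macros[name]:
                self.call(inner_name, inner_args, _inside_hygienic=inner)
        else:
            # Anything that is not a macro is treated as a primitive request.
            self.executed.append((name, tuple(args)))


if __name__ == "__main__":
    fmt = ToyFormatter()
    fmt.define_macro("SH", [("sp", ("1",)), ("ft", ("B",))])
    fmt.declare_hygienic("SH")
    fmt.enable_hygienic_mode()
    fmt.call("SH", ("NAME",))       # fine: low-level requests run inside SH
    try:
        fmt.call("sp", ("2",))      # explicit low-level request: rejected
    except HygieneError as err:
        print("fatal:", err)
```

In this model a macro package would define its macros, mark the ones it
wants authors to use, and then enable hygienic mode as its last step.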

Given this pair of primitives, backward compatibility and the goal of
achieving semantic markup in groff would no longer be in conflict.
Instead, macro packages get to choose where they sit on the
structured-vs.-expressive continuum by what set of requests they

At one end of the continuum, the man macros would disable all but a
dozen or so macros and a handful of relatively tractable low-level
requests (mostly font-change escapes).  Rendering this restricted set
to decent HTML would be trivial.  There would be some bitching from a
small percentage of man page authors who would have to clean up their
markup, but from working on doclifter I know exactly where to set the
bar so those complaints would be less than 1% by volume.

At the other end of the continuum would be full, old-fashioned groff
with no hygiene. People like Mike Bianchi would feel at home there.

Within the project, the importance of this one bit of mechanism is 
that it would allow us to sidestep a lot of thorny policy debates. In
effect, groff would bifurcate - one class of its markups freed to become
tighter and more structural, the other cultivating full presentation-
level expressiveness.

As to groff's typesetting model:

>              For example, groff's line-at-a-time approach to
> formatting, if unchanged, will remain an impediment to high quality
> typesetting and ensure groff's demise for anything other than
> writing manpages.

You speak truth.  And here's how it bites: man pages don't really 
need expressive typography.  If that ends up being groff's only
application, it'll be moribund.

>                    Since the point of implementing page-at-once
> formatting (or, as Werner dreamed, document-at-once) would be
> to improve the quality of typeset output, not to change the
> fundamentals of groff usage, resisting such a change seems like
> misplaced Luddism.

Now I'm going to express an opinion which will doubtless make me
unpopular, but here it is: pure semantic markup, possibly augmented
with a stylesheet, *is* the dream of document-at-a-time formatting.

I am doubtful groff can ever become this sort of engine without 
ceasing to be recognizable as groff, but I won't be unhappy if
you prove me wrong - and later in your essay you sketch a 
path in that direction.

> * Groffers love good typography

So do I, but it is pretty much irrelevant to any use *I* ever make of
groff. So I'm going to forgo commenting on this.

> * The great presentational vs semantic markup debate

On this, unsurprisingly, I have opinions. :-)

> Eric says: 
>  "What I don't believe is that there will ever again be enough
>   demand for printer-*only* output to justify markup formats
>   and toolchains that don't also do web and ePub or functional
>   equivalent."
> In this he may well be right, but he is speaking of a world where
> precise control over typography no longer plays the role it
> does presently in document design.

I don't think your conclusion follows.  That is, I don't see any reason
why a combination of stylesheets with in-document processing instructions
to declare local exceptions could not be fully expressive in your terms.
(I speak as a person who has produced good-looking books in XML-DocBook.)

In actual fact the stylesheet-based engines are not quite that good yet.
But getting them there won't require any conceptual breakthroughs. If
groff is ever rendered entirely obsolete, this is how it will happen. And
groff can only compete by having developers realistically aware of this.

> I think what happened is that, over time, near-exclusive use of the
> PostScript driver caused many of us to confuse groff output with
> grops output--if not intellectually, at least at the conceptual
> level.  We began to think of groff as a PostScript typesetting
> engine.

I think this is a very shrewd observation.

> I suspect the uneasiness with what Eric has to say about groff's
> future, and the whole semantic-vs-presentational debate, stems from
> Eric addressing the issue in terms of groff's original mandate

Yes.  Savor the irony that my adherence to Unix philosophy
about device-independence is what draws me past groff to XML and
asciidoc.  My problem with the presentationalism around here
isn't that you guys are old-school, it's that you're not
old-school *enough*!

> We cannot ignore the need for groff to accommodate the Web and
> ePub-y type things, despite its paper-centricity.  Eric is quite
> right that "printer-only" will never again be enough.  However, as
> he also points out, attempting to extrapolate semantic meaning from
> groff output is impossible "... because the information required to
> do that is thrown away at macro expansion time."  The conclusion
> he draws from this strikes me as self-evident: "The difficult but
> correct thing to do is to recover structural information by looking
> for cliches in the source markup *before* it goes through troff."

In case it is lost on anyone, this is exactly the design bet I made
when I wrote doclifter.  Which does its job pretty well. 
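
To make "looking for cliches in the source markup" concrete, here is a
minimal Python sketch that recovers structure from one very common man
page cliche: a NAME section whose body has the form "names \- purpose".
It is an illustration only and is not taken from doclifter; the function
name lift_name_section and the returned dictionary layout are
assumptions made for this example.

```python
import re

# Body line of a NAME section: "ls, dir \- list directory contents"
NAME_LINE = re.compile(r"^(?P<names>.+?)\s+\\-\s+(?P<purpose>.+)$")
# The section heading itself, with or without quotes around NAME.
NAME_HEADING = re.compile(r'^\.SH\s+"?NAME"?\s*$')


def lift_name_section(source):
    """Return {'names': [...], 'purpose': str} from raw man source, or None.

    Works on the markup before any macro expansion, which is where the
    structural information is still available.
    """
    lines = source.splitlines()
    for i, line in enumerate(lines):
        if not NAME_HEADING.match(line):
            continue
        for candidate in lines[i + 1:]:
            if candidate.startswith("."):
                break  # reached the next request or macro; give up
            match = NAME_LINE.match(candidate)
            if match:
                names = [n.strip() for n in match.group("names").split(",")]
                return {"names": names, "purpose": match.group("purpose")}
    return None


if __name__ == "__main__":
    page = ".TH LS 1\n.SH NAME\nls, dir \\- list directory contents\n.SH SYNOPSIS\n"
    print(lift_name_section(page))
```

A real translator needs many such recognizers and has to cope with
pages that break the expected pattern, which is where the difficulty
lies.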

> But why "difficult"?  Well, mostly owing to historical groff
> (mis?)use, which fostered conditions where, in Mike's words,
> "...presentation and formating are horribly intermingled."

This is why I think my hygienic-mode proposal is important. It
gives us the option to put a hard stop to that nonsense.

> The trick, of course, is writing well-formed source files, with
> a clear distinction between metadata, stylesheet, semantic tags,
> and discardable presentational markup.

Preach it, brother!  If this can be done, it can save groff. And if
you are declaring that as the guiding principle for your tenure
as the project lead, I'll support it all the way.

Eric S. Raymond
