Re: mdoc(7) prologue regressions

From: G. Branden Robinson
Subject: Re: mdoc(7) prologue regressions
Date: Sat, 16 Jul 2022 02:21:49 -0500

Hi Ingo,

At 2022-06-27T00:29:08+0200, Ingo Schwarze wrote:
> The first issue i identified is a group of regressions in the
> behaviour of the mdoc(7) prologue macros .Dt and .Os.
> The regressions aren't particularly severe because all that i found
> so far only trigger when the document uses these macros incorrectly.
> All the same, i'd like to report them such that we can decide
> whether we want to fix some or all of them.
> I suspect that this commit might be responsible but admit
> that i did not prove this suspicion by testing right before
> and right after the commit.  I only tested that the behaviour
> changed as described below from groff-1.22.4 to groff-current:
>   commit a1e6c19176d38823d8dc6c9a619a493ca90bdca4
>   Author: G. Branden Robinson
>   Date:   Sun Oct 3 23:15:12 2021 +1100
>   [andoc,man,mdoc]: Fix Savannah #61266.
>   Resolve problems in batch rendering of man pages to PDF arising from
>   entanglement of end-of-input traps, page location traps, continuous
>   rendering mode, and andoc's reloading of the (m)an and (m)doc packages.
>   [...]

That commit was an immense pain to get "right".  As I feared, my words
from later in the commit message have come back to haunt me.

 Refactoring is needed: some macros and registers have misleading names,
 there is some code duplication in mdoc, and some of the trap management
 problems are solved in slightly different ways in man(7) and mdoc(7),
 perhaps unnecessarily.  We also need some test scripts to protect us
 from regressions.  But this fixes the rendering problems.

I didn't do the regression tests.  But it probably would not have
occurred to me at the time to test the incorrect usage modes of the
mdoc(7) macros.

For all of these issues, I have the same pair of questions: is that a
regression or just a difference?  Is there a specification for this

It may not be necessary to answer them, however.

>  1. When there are two .Dt macros in the prologue, the last one used
>     to win, setting the page title, section number, and section title.
>     Now, the first one wins, setting these fields.
>  2. When a .Dt macro occurs in the body of the page (as opposed to
>     in the prologue), it used to be ignored.  Now, it causes a
>     large number of blank lines in the output.
> Both issue 1 and issue 2 can be seen with this test file:
>  3. When the first .Dt macro comes late, the page title used to be
>     set to "UNTITLED".  Now, it is set to the empty string.
> Both issue 2 and issue 3 can be seen with this test file:
>  4. If there is no .Dt macro at all, the page title used to be
>     set to "UNTITLED".  Now, it is set to the empty string, see:
>  5. When the usual order of .Dt and .Os is exchanged,
>     the .Dt macro is now completely ignored, setting the page title
>     to the empty string and the section title to "LOCAL", see
>  6. The same regression as for issue 5 occurs when there are two .Os
>     macros in the order .Dd .Os .Dt .Os, see
>  7. When the .Os macro comes late - i.e. in the body of the page
>     rather than at the usual place in the prologue -
>     the header line now appears at that place in the middle of the
>     body and no longer at the top of the manual page where it belongs, see
>  8. When the .Os macro is completely missing, the header line is no
>     longer printed at all, see
> The most severe issue is probably number 8 because forgetting the .Os
> macro, or thinking it might be optional, might even happen in real-world
> manual pages.  The next most severe would then be issue 5 because
> mixing up the order might also happen in practice.
> Number 7 is also somewhat unfortunate.  While not quite as likely to
> happen as putting the .Os macro at the wrong place *inside* the prologue,
> the effect produced is very ugly.  Similarly, issue 2 is unlikely
> to occur in practice, but the effect is also very ugly.
> The remaining issues 1, 3, 4, and 6 are less severe.  But in case we
> decide that some of the more severe regressions need fixing, maybe
> properly fixing them all might not cause that much extra work?
> In any case, i thought listing them all might potentially be useful.

I'll have a fresh look at my changes to groff mdoc in this commit and
see if I can find good spots to recover more gracefully.

I have to admit I'm tempted to either let these fail, or, if I can find
a good place to stick a sanity-checking hook, simply refuse to render
the page if these macros, documented as mandatory, are missing.  But,
maybe replacing the missing content with shouty-caps stuff like
"UNTITLED" and "LOCAL" suffices to clue the user in.  Perhaps also
changing the fallback operating system string from "BSD" to "GNU" will
more effectively agitate misusers of mdoc(7) into correcting their ways.

(Relatedly, I don't understand why anyone thought it was a good idea for
the volume titles for mdoc(7) man pages in a GNU project to all announce
themselves as being from a BSD manual even if they're rendering man
pages that have nothing to do with BSD.  This name is _not_ derived from
any argument to the `Os` macro, nor configured based on the build host's
OS identity, but hard-coded in `doc-volume-operating-system`.  If this
string were made empty or deleted, the volume titles would exactly match
those used by groff man(7).)

Anyway, handling the repeated cases seems like it should be easier, by
testing a flag register in each of .Dt, .Dd, and .Os, and clearing that
register again as part of the end-of-input macro.
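
As a rough illustration of that flag idea, here is a small Python model
(not roff, and not groff's actual implementation).  The class name
`Prologue`, its methods, and the warning text are all hypothetical.  The
sketch also assumes that a repeated call is ignored, so the first one
wins; overwriting the stored arguments instead would give
last-one-wins.

```python
class Prologue:
    """Toy model of mdoc prologue bookkeeping; every name is hypothetical."""

    MACROS = ("Dd", "Dt", "Os")

    def __init__(self):
        # One "flag register" per prologue macro.
        self.seen = {name: False for name in self.MACROS}
        self.values = {}
        self.warnings = []

    def call(self, macro, *args):
        """Record the first call of a prologue macro; ignore repeats."""
        if self.seen[macro]:
            # Assumption: repeats are ignored with a diagnostic.
            self.warnings.append(f"ignoring repeated .{macro}")
            return
        self.seen[macro] = True
        self.values[macro] = args

    def end_of_input(self):
        """Clear the flags so the next document in a batch starts fresh."""
        for name in self.MACROS:
            self.seen[name] = False
        self.values.clear()


page = Prologue()
page.call("Dt", "FOO", "1")
page.call("Dt", "BAR", "2")   # repeated: ignored, warning recorded
page.end_of_input()           # next page in the batch
page.call("Dt", "BAR", "2")   # accepted again after the reset
```

In groff itself the flags would be number registers tested inside each
macro, and the reset would live in the end-of-input macro.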

groff man(7) in groff Git behaves pretty badly if `TH` is omitted,
whereas groff 1.22.4 degrades much more gracefully.

I am however loath to give up the (in my opinion) immense improvements
I've managed to hammer into place for the rendering of multiple man(7)
and mdoc(7) documents with one groff command.  They have made possible
the (again, in my opinion) attractive result that can be seen in the
shipping groff-man-pages.pdf document.

Here's the rule that produces groff-man-pages.pdf.

doc/groff-man-pages.pdf: $(GROFF_MAN_PAGES_ALL) eqn pic tbl
        $(GROFF_V)$(DOC_GROFF) -pet -Tpdf -P-e -mandoc -rC1 \
          -rCHECKSTYLE=3 $(GROFF_MAN_PAGES1) \
          $(tmac_srcdir)/sv.tmac $(GROFF_MAN_PAGES2) \
          $(tmac_srcdir)/en.tmac $(GROFF_MAN_PAGES3) > $@

If you attempt anything like the foregoing with groff 1.22.4 or earlier,
the result will be most unpleasant.  Long story short, batch rendering
is not compatible with rendering any man(7) pages after any mdoc(7)
page.  This seems eerily consistent with the views of mdoc(7)
partisans--once you've read a page in its language, why would you ever
go back to man(7)?  ;-P

I further note that the new `MR` macro, which last month came in for a
renewed round of derogation on this list as unworthwhile and excessively
novel[1], enables the local man page hyperlinks you see at Deri's
site[2] to be achieved in any collection of man(7) and mdoc(7) documents
without an external database.  (For the time being, at that site,
they're even in Ingo's preferred typeface. ;-) [3])

But, if there's a reasonably clean way to get both good batch rendering
and graceful degradation of defective documents, I'm amenable.


[1] despite much prior art, ranging from Ultrix's man's `MS` to mdoc's
    `Xr` to, of course, plan9port's `MR`
[3] Your preference will be replaced with the user's (via the `MF`
    string) once I have integrated Deri's further proof-of-concept that
    he sent me about three weeks ago.
