[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: lots of fatal build system bugs on OpenBSD

From: G. Branden Robinson
Subject: Re: lots of fatal build system bugs on OpenBSD
Date: Tue, 22 Mar 2022 19:59:10 +1100
User-agent: NeoMutt/20180716

Hi, Ingo!

At 2022-03-22T00:54:05+0100, Ingo Schwarze wrote:
> There are so many issues already now that it makes sense to get
> these fixed before even starting a review for cleanliness and
> before starting run time testing.

Yes.  I think nearly all of the issues arise from two deficiencies in my
knowledge: (1) inexperience with groff's build system, and (2)
inexperience with writing portable make(1).  GNU Make has always been
around for my entire development career and I have acquired some
parochial habits.  These are among the reasons I don't really want to be
groff's release manager.  They are remediable defects, but acquiring the
skills will take time.

Or I can learn the hard way.

> I'm sending this to you directly because you are a by far the
> most active groff developer right now, so i hope we can agree
> how to fix all this and then i can then push one or more patches
> doing it.  This really doesn't feel like the stuff to shove under
> the bugtracker carpet and let it rot there.

Let me reassure you that not all issues filed against groff in Savannah
die a death of neglect.  61 have been resolved this calendar year[3]; of
those, 51 were fixed[4] (as opposed to duplicates, invalid, and so on).

> Please be assured that i'm *not* claiming you are responsible for
> any of these regressions;

Most of them are, though!  I appreciate your diligence in identifying
them so that I can learn from my errors.

> So, let's go through the patch appended at the end.
> It's a minimal patch to make the software build.
>  1. File, .version.tree mechanism
>     There are several independent problems with this.
[big snip]

This one was definitely my fault; it arose from a Savannah ticket Bjarni
filed.  Long story short, I was unaware of the mechanism available in
build-aux/git-gen-version despite the comment pointing to it.  (I think
that comment could be improved.)

Your fix for this is equivalent to reverting c1bb33e5b, so that is how
we should dispose of it, right?  Savannah #61502 can be reopened and
re-resolved with instructions and a pointer to the `git-version-gen`

>  2. contrib/sboxes/, 1st chunk (SBOXES_PROCESSEDEXAMPLEFILES)
>     This one was hellish to figure out.  The problem is that it
>     results in an empty rule
>       contrib/sboxes/ contrib/sboxes/msboxes.pdf: yada yada
>         no shell commands following
>     whitout leading "./" whereas the later $(sboxes_builddir)/
>     target rule results in a rule
>       ./contrib/sboxes/ yada yada
>          some shell commands here
>     with a leading "./".  BSD make considers "contrib/sboxes/"
>     and "./contrib/sboxes/" different targets and does *not*
>     combine these rules, so when it comes to building the target,
>     it uses the empty shell script and vonsequently the file never
>     gets built.

That seems fragile to me...whether a (non-phony) target needs to be
updated should depend on the modification time of a file system object
it identifies, not the spelling of its name, right?

(Setting aside symbolic links, that is, which raise the problem of
whether the symlink itself or that which it points to should be
considered a target.  Maybe for Make's purposes there is no difference,
but many system calls have grown warty flags to annotate whether
symlinks should be followed or not.  In my opinion, symlinks are a
devil's bargain that are better replaced with bind mounts, per-process
or per-user views of the file system, or similar.  Similarly the whole
class of system calls openat() and friends reflect an unfortunately
early choice to pretend that a file system was a coherent, consistent
data store.  Now Linux has openat2(), to reflect the further late
realization that an int-sized bitfield called `flags` gives a hostage to
fortune, when you could store a pointer to a struct in the same amount
of space and have a system _library_ translate the "call" into an
appropriately configured set of true, primitive system calls with a
minimal number of memory loads.[6]  But I digress by wreathing myself in
the comforting flames of an operating systems design symposium.)

>     I believe the best fix is to always use $(sboxes_builddir)
>     for the generated files rather than switching back and forth.


>  3. contrib/sboxes/, 2nd chunk (msboxes.pdf)
>     The DOC_GROFF mechanism is seriously complicated, and one of
>     its many features is that it uses the variable $< which POSIX
>     says can only be used in suffix rules, so this fails outright
>     with POSIX make.
>     On top of that, the whole point of DOC_GROFF is to pipe
>     standard input into DOC_GROFF_ONLY which then calls groff.

For the sake of posterity and readers of the mailing list, one of the
points of DOC_SED, which DOC_GROFF calls (and DOC_GROFF_ONLY doesn't) is
to _alter_ the standard input stream, inserting `lf` requests
identifying the file being processed, so that when problems occur during
target generation, we don't get this:

troff:<standard input>:12968: error: foo was too bletcherous

but rather error: foo was too bletcherous

...which is _way_ more helpful.

However, see below.

>     So it makes no sense whatsoever to add a file name argument
>     (in this case, "$(sboxes_builddir)/") after
>     DOC_GROFF because that causes standard input to be discarded,
>     neutering most of what DOC_GROFF did earlier.
>     Fortunately, there is a simple way to fix both issues at the
>     same time: change the rule to become a suffix rule
>     and remove the bogus argument.
>     Alternatively, one could keep the argument, change DOC_GROFF
>     to DOC_GROFF_ONLY, and not switch to a suffix rule; i have
>     no strong preference.

I prefer the latter not just for the reason above, but because a suffix
rule is too general--I don't want it to cause problems later when we
add other PDF targets generated from ms sources.  Likely no other ms
document is going to need to load the sboxes macro package, at least not
for a little while.  More importantly, another ms document might have
other needs, like running the tbl preprocessor.  (Yes, we could have a
really general DOC_GROFF-style macro that runs all the preprocessors.  I
don't like that approach because I find it wasteful.)

I believe we have a pending Savannah ticket requesting that we ship
documents in PDF in addition to or instead of PostScript format.  It
seems a reasonable request in 2022--particularly if/as we add pdfmark

That said, I don't find the Automake variable _names_ very illuminating.
Once one realizes that they're prefixed with a scope to keep from
stepping on identifiers from other parts of the source tree, they make
more sense, but when I first encountered the term "DOC_GROFF" my brain
offered several parses in confusion.  The suffix "_ONLY" doesn't cover
itself in glory, either.

But see below.

>  4. doc/, doc/groff-man-pages.* rules:
>     Again, we have $< in non-suffix rules, this time not only
>     with one, but with many file name arguments.  Amazingly,
>     GNU make somehow copes.  Not that what GNU make does makes
>     any sense - it appears the $< picks the first dependency
>     and silently discards all the others, which is obviously
>     the perfect way to shove errors under the carpet - but at
>     least it doesn't crash like POSIX make does!  ;-)
>     Then again, all that resulted from $< is discarded anyway... :-(
>     So here, i think the best fix is to just be explicit and
>     switch from DOC_GROFF to DOC_GROFF_ONLY.

Yes, it turns out I didn't need DOC_GROFF here, because:

(1) groff is reading from file operands instead of the standard input
    stream.  To experiment, while composing this mail I deliberately
    damaged a man page and all of the ms(7) and me(7) documents in doc/,
    and troff issues diagnostics with a useful file name in every case;

(2) Every target for which we need sed to do actual work (replacing
    @VERSION@ and @g@) is now generated by a target rule that expands
    $(DOC_SED) explicitly.

In the future we can produce and from .in documents as
well, so that the site-configured command prefix is used, improving
documentation applicability and accuracy.

So, +1.

While I've got you here I wanted to clarify my make(1) understanding.

Here's POSIX[1].

>> $?
>>     The $? macro shall evaluate to the list of prerequisites that are
>>     newer than the current target. It shall be evaluated for both
>>     target and inference rules.
>>     For example, in a makefile target rule to build prog from
>>     file1.o, file2.o, and file3.o, and where prog is not out-of-date
>>     with respect to file1.o, but is out-of-date with respect to
>>     file2.o and file3.o, $? represents file2.o and file3.o.
>> $<
>>     In an inference rule, the $< macro shall evaluate to the filename
>>     whose existence allowed the inference rule to be chosen for the
>>     target. In the .DEFAULT rule, the $< macro shall evaluate to the
>>     current target name. The meaning of the $< macro shall be
>>     otherwise unspecified.
>>     For example, in the .c.a inference rule, $< represents the
>>     prerequisite .c file.

Since the standard says that $? "shall be evaluated for both target and
inference rules", and suffix rules are a variety of inference rule, I
should be able to use it in any rule's commands.

Is this reasoning sound?

>  5. doc/, doc/meintro.*, doc/ rules:
>     More $< GNUisms.
>     The best fix here seems to replace $< by the POSIX $?.

+1.  Yes.  This is another one that is my doing; I learned the $< sigil
a long time ago, poorly it seems (but accommodated by GNU
make(1)'s...generous interpretation thereof).

>  6. doc/, doc/ rule:
>     Yet another $< GNUism.
>     This is harder to fix because an "" suffix rule
>     lacking the -mfr already exists in doc/ and clashes.
>     But with the right "SUFFIXES +=" incantation to please automake,
>     we can get a "" work in a portable way.
>     Alternatively, we could switch to DOC_GROFF_ONLY instead;
>     again, no strong preference.

+1.  Rewriting the suffix rule works for me.  I admit it may seem
inconsistent with my response to item 3.  My feeble defense is that we
don't have any other me(7) documents in the tree, let alone any with
French translations.  If that changes, we'll need a target rule for this

In fact, in my own experiments, DOC_GROFF is no longer even necessary
once your changes and a few of my own are made.  So a subsequent commit
can simply rename DOC_GROFF_ONLY to DOC_GROFF.

I knew "DOC_GROFF_ONLY" was not my invention, so I looked it up[2].

>  7. doc/, manual SUFFIXES definition:
>     This is the only item on my list that isn't required to make
>     it build, but i had to do so much wrestling with the .SUFFIXES
>     pseudo-target anyway that i cleaned this up while there.
>     The .xhtml is not used anywhere, for nothing whatsoever.
>     All the others get automatically generated by automake,
>     so there is no need to manually add them.
>     In fact, maually adding them is detrimental in two ways:
>     The manual list is doomed to get outdated (.xhtml proves
>     the point) and it adds noise to the generated Makefile,
>     hindering debugging (which is already difficult enough in
>     such generated Makefiles without additional, pointless
>     distractions).

+1.  I'm fine with this.  I deny responsibility for the stray "xhtml"
suffix; it dates back several years.  I get to wear enough of the blame,
as 5 of the 6 fatal problems on OpenBSD are my doing.  :-O

> So, here are my questions:
>  I. Do you agree with all my points, or did i miss something,
>     somewhere?

I mostly agree.

>  II. When you apply my patch, do things still work for you on Linux?
>  III. Do you prefer that i commit this in one single commit
>       or separately?  If separately, do you want somme issues
>       grouped together, for example
>        * one commit for issue 1
>        * one commit for issue 2
>        * one commit for issue 3
>        * one commit for issues 4-6
>        * one commit for issue 7

Let's split them up; item 1 is best solved with a reversion; I'll do it.

>  IV. I'm pretty sure these are all recent regressions and none
>      of these problems existed in the last release.  Consequently,
>      do you agree that ChangeLog entries are not needed?

This may be our biggest difference of opinion.  As I understand it, the
"NEWS" file is for user-visible changes since the previous release, but
the ChangeLog is for the logging of code changes visible to developers.
Bug fixes applicable only to "unstable" versions are not to be omitted.

I'm willing to break down, changelog, and push these, with due credit to
you as `--author`, of course.  I have little else pending since I just
pushed yesterday and, as it happens, I wanted to see if I could get
access to a macOS box again and try a build there.

> P.S.
> Right now, i'm quite fed up because i had to do at least one bootstrap
> - configure - build cyle for each of the issues, and some of them
> weren't even identified until i did a complete bootstrap - configure -
> build - make-dist - port-patch - port-configure - port-built -
> port-fake-install cycle, and then i had to repeat all these cycles to
> develop and test fixes, and in between, some required quite some
> staring at Makefiles to understand.  At one point i even patched
> additional printf(3) statements into our BSD make(1) implementation
> for debugging and stepped through make(1) with gdb because at first i
> failed to understand from mere "make -d m" debugging output what was
> really going on.  Ugh!

That's terrible, and I am very sorry that my casual use of GNU Make
features created such a mess.

Not to diminish my culpability in this matter, but were I in your shoes
I'd be turning an eye toward improving BSD make(1)'s diagnostics and/or
debugging output.  You have an obvious traceability use case and the
tool did not adequately assist you.

To throw some shade at GNU Make as well, its manual does not seem to
document `$<`'s status as a land mine when used in suffix rules.  I
looked fairly hard at what seemed to be the relevant chapter[5], and
don't see any mention of it, nor in the next one, "Incompatibilities and
Missing Features".  That said, I am likely to have overlooked it even if
it were there because this manual is one of a few saddled with invariant
material (Front-Cover and Back-Cover Texts).  It is thus not DFSG-free,
an awkward property for GNU Project documentation.

> But i believe i'll get over it and do a full audit for cleanliness plus
> run-time testing on some other day, once these nasty issues are out of
> the way.
> It would no doubt be better if i tested more often and did not let
> so many issues accumulate.  :-/

I don't _ever_ push without verifying several soundness criteria.  I
always "distcheck".  I always eyeball any differences in man pages,, and me*.me in nroff mode, and for the man pages I further inspect
differences with continuous rendering (-rcR) on and off, and HTML output
as well.

Unfortunately, I'm running in a rather pedestrian Debian-based x86-64
environment with a GNU toolchain.  Step even a little bit off of that
path in any direction, and our assurance level is low.  (One day, I hope
to have a RISC-V-based development box--not as a virtual machine, but as
a daily driver.)

It sounds like one of the things I need to add to my push process is an
update of the .version file.

You have likely saved us, and me in particular, a big embarrassment with
1.23.0.rc2--thank you!

Best regards,

[6] I have drunk deeply of the microkernel Kool-Aid and it tastes good.

Attachment: signature.asc
Description: PGP signature

reply via email to

[Prev in Thread] Current Thread [Next in Thread]