[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: lots of fatal build system bugs on OpenBSD

From: Ingo Schwarze
Subject: Re: lots of fatal build system bugs on OpenBSD
Date: Tue, 22 Mar 2022 18:07:41 +0100

Hi Branden,

G. Branden Robinson wrote on Tue, Mar 22, 2022 at 07:59:10PM +1100:
> At 2022-03-22T00:54:05+0100, Ingo Schwarze wrote:

>> There are so many issues already now that it makes sense to get
>> these fixed before even starting a review for cleanliness and
>> before starting run time testing.

> Yes.  I think nearly all of the issues arise from two deficiencies in my
> knowledge: (1) inexperience with groff's build system, and (2)
> inexperience with writing portable make(1).  GNU Make has always been
> around for my entire development career and I have acquired some
> parochial habits.  These are among the reasons I don't really want to be
> groff's release manager.

That sounds entirely understandable to me.
Fortunately, checking a few Makefile patches proved easier than
being a release manager.  Thanks for those checks & feeback!-)

>> I'm sending this to you directly because you are a by far the
>> most active groff developer right now, so i hope we can agree
>> how to fix all this and then i can then push one or more patches
>> doing it.  This really doesn't feel like the stuff to shove under
>> the bugtracker carpet and let it rot there.

> Let me reassure you that not all issues filed against groff in Savannah
> die a death of neglect.  61 have been resolved this calendar year[3]; of
> those, 51 were fixed[4] (as opposed to duplicates, invalid, and so on).

That is indeed great and highly appreciated; yet there is such
an inflation of irrelevant tickets that some relevant ones rot even
though Dave and you are doing very good work on the tracker.
I think that can't be helped with the current number of developers.

>>  2. contrib/sboxes/, 1st chunk (SBOXES_PROCESSEDEXAMPLEFILES)
>>     This one was hellish to figure out.  The problem is that it
>>     results in an empty rule
>>       contrib/sboxes/ contrib/sboxes/msboxes.pdf: yada yada
>>         no shell commands following
>>     whitout leading "./" whereas the later $(sboxes_builddir)/
>>     target rule results in a rule
>>       ./contrib/sboxes/ yada yada
>>          some shell commands here
>>     with a leading "./".  BSD make considers "contrib/sboxes/"
>>     and "./contrib/sboxes/" different targets and does *not*
>>     combine these rules, so when it comes to building the target,
>>     it uses the empty shell script and vonsequently the file never
>>     gets built.

> That seems fragile to me...whether a (non-phony) target needs to be
> updated should depend on the modification time of a file system object
> it identifies, not the spelling of its name, right?

Yes, i admit there is some fragility in OpenBSD make in this respect.

Then again, the rule "targets are strings" is much simpler than
the rule "targets are inodes", in particular since it is common
to define targets that are not file system objects at all.

So while our OpenBSD make behavoiur caused fragility and surprise
here, i suspect GNU make behaviour just causes different complexity,
fragility, and surprises.  In any case, thinking of targets as strings
and not hoping that make(1) will magically recognize strings pointing
to the same file system object is better for portability.

Also consider that it isn't unusual for a make(1) implementations
to finish constructing the input graph before they even start to
investigate which targets are needed for the current run and to
consider whether each of those is up to date or outdated, and
there is no real need to inspect the file system before that late
point in time.  Finally, to realize that two different strings may
end up pointing to the same file is not even possible by inspecting
the file if one or both of the names do not yet exist in the file
system, which is obviously a typical situation for make(1).

>>     So it makes no sense whatsoever to add a file name argument
>>     (in this case, "$(sboxes_builddir)/") after
>>     DOC_GROFF because that causes standard input to be discarded,
>>     neutering most of what DOC_GROFF did earlier.
>>     Fortunately, there is a simple way to fix both issues at the
>>     same time: change the rule to become a suffix rule
>>     and remove the bogus argument.
>>     Alternatively, one could keep the argument, change DOC_GROFF
>>     to DOC_GROFF_ONLY, and not switch to a suffix rule; i have
>>     no strong preference.

> I prefer the latter not just for the reason above, but because a suffix
> rule is too general--I don't want it to cause problems later when we
> add other PDF targets generated from ms sources.  Likely no other ms
> document is going to need to load the sboxes macro package, at least not
> for a little while.  More importantly, another ms document might have
> other needs, like running the tbl preprocessor.

Fair points all around, so i pushed a DOC_GROFF -> DOC_GROFF_ONLY
instead for this rule.

> I believe we have a pending Savannah ticket requesting that we ship
> documents in PDF in addition to or instead of PostScript format.  It
> seems a reasonable request in 2022--particularly if/as we add pdfmark
> support.

That sounds reasonable to me, too.

> That said, I don't find the Automake variable _names_ very illuminating.
> Once one realizes that they're prefixed with a scope to keep from
> stepping on identifiers from other parts of the source tree, they make
> more sense, but when I first encountered the term "DOC_GROFF" my brain
> offered several parses in confusion.  The suffix "_ONLY" doesn't cover
> itself in glory, either.

Indeed, those two are poorly named.  Then again, both refer to
facilities that are *not* UNIX-style building boxes doing one small,
clearly delimited task that allows a clear and concise naming.  Instead,
both are complicated and have a lot of open, dangling ends (like
providing some hard-coded options but almost never all that are needed,
like involving very non-obvious hard-coded pipes ...) and seeme to
be designed from what happened to be, more or less by chance, common
parts of some tasks that arose in a few dozens of rules.  Such eclectic
attempts at factoring out common aspects of some tasks are notoriously
hard to name - and it is notoriously hard to remember what exactly they
do, so mistakes while using them are to be expected.  I claim the poor
naming is a symptom of them being poorly designed in the first place.

>>  4. doc/, doc/groff-man-pages.* rules:
>>     Again, we have $< in non-suffix rules, this time not only
>>     with one, but with many file name arguments.  Amazingly,
>>     GNU make somehow copes.  Not that what GNU make does makes
>>     any sense - it appears the $< picks the first dependency
>>     and silently discards all the others, which is obviously
>>     the perfect way to shove errors under the carpet - but at
>>     least it doesn't crash like POSIX make does!  ;-)
>>     Then again, all that resulted from $< is discarded anyway... :-(
>>     So here, i think the best fix is to just be explicit and
>>     switch from DOC_GROFF to DOC_GROFF_ONLY.

> Yes, it turns out I didn't need DOC_GROFF here, because:
> (1) groff is reading from file operands instead of the standard input
>     stream.  To experiment, while composing this mail I deliberately
>     damaged a man page and all of the ms(7) and me(7) documents in doc/,
>     and troff issues diagnostics with a useful file name in every case;
>     and
> (2) Every target for which we need sed to do actual work (replacing
>     @VERSION@ and @g@) is now generated by a target rule that expands
>     $(DOC_SED) explicitly.

Thanks for those meaningful tests, they further bolster my confidence
that the changes i pushed are about right.

> While I've got you here I wanted to clarify my make(1) understanding.
> Here's POSIX[1].
>>> $?
>>>     The $? macro shall evaluate to the list of prerequisites that are
>>>     newer than the current target. It shall be evaluated for both
>>>     target and inference rules.
>>>     For example, in a makefile target rule to build prog from
>>>     file1.o, file2.o, and file3.o, and where prog is not out-of-date
>>>     with respect to file1.o, but is out-of-date with respect to
>>>     file2.o and file3.o, $? represents file2.o and file3.o.
>>> $<
>>>     In an inference rule, the $< macro shall evaluate to the filename
>>>     whose existence allowed the inference rule to be chosen for the
>>>     target. In the .DEFAULT rule, the $< macro shall evaluate to the
>>>     current target name. The meaning of the $< macro shall be
>>>     otherwise unspecified.
>>>     For example, in the .c.a inference rule, $< represents the
>>>     prerequisite .c file.

> Since the standard says that $? "shall be evaluated for both target and
> inference rules", and suffix rules are a variety of inference rule, I
> should be able to use it in any rule's commands.
> Is this reasoning sound?


But beware that $? can have surpising effects in rules that contain
more then one prerequisite because the shell commands executed will
then vary according to which prerequisite(s) are out of date.
That's not a problem in the case at hand because when there is
exactly one prerequisite, the shell code will get executed if and
only if that single prerequisite is outdated, so $? will always
expand to the same string for such a rule.

> In fact, in my own experiments, DOC_GROFF is no longer even necessary
> once your changes and a few of my own are made.  So a subsequent commit
> can simply rename DOC_GROFF_ONLY to DOC_GROFF.

Yes, getting rid of the execessively complicated DOC_GROFF facility
would be nice.  A Makefile is not the best venue for holding a sed(1)
and sh(1) golfing contest.

> I knew "DOC_GROFF_ONLY" was not my invention, so I looked it up[2].
> [2]\
>             ?id=194b0d4d6a56a114655426c60c69126cef01ed01

Heh.  Uh oh.  Fixing bugs and improving the design at the same
time always feels best, but sometimes, one fails to achieve
that and instead accumulates even more technical debt.

> Let's split them up;

Done.  They are all in now, one of them in the form you preferred.

> item 1 is best solved with a reversion; I'll do it.

Thanks, that one looks good.

>>  IV. I'm pretty sure these are all recent regressions and none
>>      of these problems existed in the last release.  Consequently,
>>      do you agree that ChangeLog entries are not needed?

> This may be our biggest difference of opinion.  As I understand it, the
> "NEWS" file is for user-visible changes since the previous release, but
> the ChangeLog is for the logging of code changes visible to developers.
> Bug fixes applicable only to "unstable" versions are not to be omitted.

Makes sense, so i included ChangeLog entries when committing.

> That's terrible, and I am very sorry that my casual use of GNU Make
> features created such a mess.

Don't worry, if you weren't doing all the great work you are doing,
the damage would be much worse that having to do one day of debugging
and bugfixing once in a while.

> Not to diminish my culpability in this matter, but were I in your shoes
> I'd be turning an eye toward improving BSD make(1)'s diagnostics and/or
> debugging output.  You have an obvious traceability use case and the
> tool did not adequately assist you.

Well, frankly, the -d m debugging output did report that make(1) used
contrib/sboxes/ at one place and ./contrib/sboxes/
at another, but even looking at the output (which is necessarily very
large for such a large Makefile), i failed to spot the difference,
and that sent me down the rabit hole into printf(3) and gdb(1) until
i suddenly realized "oh the reason these two input graph nodes remain
separate rather than being combined is that the bloody path names are
simply *different* from each other.  Duh!"

But yes, you are right that the format and meaning of -d m output
could be much better documented.

> To throw some shade at GNU Make as well, its manual does not seem to
> document `$<`'s status as a land mine when used in suffix rules.  I
> looked fairly hard at what seemed to be the relevant chapter[5], and
> don't see any mention of it, nor in the next one, "Incompatibilities and
> Missing Features".  That said, I am likely to have overlooked it even if
> it were there because this manual is one of a few saddled with invariant
> material (Front-Cover and Back-Cover Texts).  It is thus not DFSG-free,
> an awkward property for GNU Project documentation.

Sure, documenting portability considerations is among the more difficult
tasks in documentation.  In general, in particular when the number of
extensions is small, i prefer first documenting all that the tool provides
(not digressing into portability), then listing the features that are
extensions below STANDARDS.  But for a tool like make(1) where more than
half of the features are extensions, and all the same portability is
paramount, that isn't the right way.  Consequently, our make(1) manual
is riddled throughout with sentences like these:

     The handling of ‘.depend’ is a BSD extension.

     There are seven different types of lines in a makefile: dependency
     lines, shell commands, variable assignments, include statements,
     conditional directives, for loops, and comments.  Of these, include
     statements, conditional directives and for loops are extensions.

     The : operator is the only standard operator.  The :: operator is
     a fairly standard extension, popularized by imake.  The ! operator
     is a BSD extension.

     As an extension, targets and prerequisites may contain the shell
     wildcard expressions ‘?’, ‘*’, ‘[]’ and ‘{}’.

     +=      Append the value to the current value of the variable

     The flags [-BCDdIjmV] are extensions to that specification.

And so on, there are many more sentences in that vein.

>> But i believe i'll get over it and do a full audit for cleanliness plus
>> run-time testing on some other day, once these nasty issues are out of
>> the way.
>> It would no doubt be better if i tested more often and did not let
>> so many issues accumulate.  :-/

> I don't _ever_ push without verifying several soundness criteria.  I
> always "distcheck".  I always eyeball any differences in man pages,
>, and me*.me in nroff mode, and for the man pages I further inspect
> differences with continuous rendering (-rcR) on and off, and HTML output
> as well.

That sounds very diligent, thanks for all that.

I expect most of what i noticed from the corner of my eye is probably
related to either portability or OpenBSD specifics.

> You have likely saved us, and me in particular, a big embarrassment with
> 1.23.0.rc2--thank you!

I guess shortly before we push out rc2, i also ought to test on SUN
Solaris 10 and Oracle Solaris 11.  I doubt that i'll do that at short
intervals, though, as it causes more work than testing on OpenBSD,
even if all goes smotthly, which rarely happens on those platforms.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]