Re: Some "fun" last weekend
Thu, 21 Jul 2022 10:23:20 -0400
Evolution 3.44.3 (by Flathub.org)
On Wed, 2022-07-20 at 20:33 -0400, Dmitry Goncharov wrote:
> if we take a step back, what is the problem? The problem is the
> presence of jobserver-auth in MAKEFLAGS in the non recursive case.
> You already implemented a solution which sets jobserver-auth=-2,-2.
> Another option is to remove jobserver-auth from MAKEFLAGS. This would
> rob make of a chance to notify the user that the program cannot
> participate in the job server protocol, the file descriptors cannot
> be opened. That may be better than the subtlety that you described
> above. On the other hand, if we were to rewrite the current impl
> with e.g. named pipes, the problem of jobserver-auth in MAKEFLAGS
> would stay, would it not?
No, because the problem is not really with jobserver-auth itself.
The problem is fundamentally that make is using open file descriptors
to pass information from parent makes to sub-makes, combined with the
fact that we don't actually know with 100% accuracy which processes we
start are really sub-makes, and which are not. We have a heuristic
which is not always accurate. This leads to the following problems:
1) We have to be very careful about close-on-exec: when we invoke a
sub-make we need to disable close-on-exec for these fds, and when we
invoke a process that is not a sub-make we must enable close-on-exec.
2) Even though a sub-process may invoke a sub-make, it may do other
things as well. Often a recipe does multiple things in the same
subshell, one of which is invoke a sub-make. It's impossible for us to
ensure, in these cases, that the jobserver is available only to make.
3) If other processes see the open fds and start reading/writing them
(this has happened before) then they'll mess up the jobserver.
4) Some processes just close all fds and this causes problems with the
jobserver. For example we have an issue with Python where it will, by
default, set close-on-exec on all open file descriptors before it runs
any subprocess. If you don't realize this and you use Python as part
of your build system as an intermediary between parent and child makes,
it breaks things in inscrutable ways.
5) The Savannah bugs also mention other issues:
6) There is another issue with setting blocking / non-blocking read on
the jobserver fds. I can't remember if there's a Savannah bug or not
about this, but changing the blocking/non-blocking status on a fd is
not local to a given process and this has caused problems for some
applications used with make. I'll have to try to locate the info on
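To make points 1 and 6 concrete, here is a minimal Python sketch (not
anything from make's sources) of the two scopes involved. It assumes a
POSIX system and uses os.dup() as a stand-in for inheritance across
fork/exec, since both leave two descriptors sharing one open file
description. The function name flag_scopes is made up for illustration.

```python
import os


def flag_scopes():
    """Show which fd flags are per-descriptor and which are shared.

    Hypothetical helper for illustration only.
    """
    r, w = os.pipe()
    dup_r = os.dup(r)  # second descriptor, same open file description as r
    try:
        # Close-on-exec (FD_CLOEXEC) belongs to the descriptor itself.
        # Python creates both r and dup_r non-inheritable; clearing the
        # flag on dup_r leaves r untouched.
        os.set_inheritable(dup_r, True)
        cloexec_is_per_fd = (
            os.get_inheritable(dup_r) and not os.get_inheritable(r)
        )

        # O_NONBLOCK belongs to the open file description, so changing
        # it through r is also visible through dup_r (and would be
        # visible in any other process that inherited the descriptor).
        os.set_blocking(r, False)
        nonblock_is_shared = not os.get_blocking(dup_r)

        return {
            "cloexec_is_per_fd": cloexec_is_per_fd,
            "nonblock_is_shared": nonblock_is_shared,
        }
    finally:
        for fd in (r, w, dup_r):
            os.close(fd)


if __name__ == "__main__":
    print(flag_scopes())
```

The second result is why a tool that flips an inherited jobserver fd to
non-blocking can surprise every other process holding that same pipe.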
So, why do named pipes help?
They help because we're not expecting the open fds to be passed down;
in fact we can set close-on-exec for ALL our (non-standard) fds, which
is what you'd want. The parent make doesn't need to use a heuristic to
figure out whether the child process is a make or not.
Instead, any sub-process can look at MAKEFLAGS, see the value of
jobserver-auth, find the name of the pipe, open it, and start to use
it. If the sub-process doesn't know anything about MAKEFLAGS, it will
never know that the jobserver is relevant. If the sub-process knows
about it, it can participate. Everything is up to the sub-process and
nothing is required of the parent make.
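As a rough sketch of that lookup from the sub-process side: the
function below scans MAKEFLAGS for jobserver-auth and reports what it
found. The "fifo:PATH" spelling of a named pipe is an assumption here
(it is not spelled out in this message); the "R,W" fd pair is the
existing form, and treating negative fds such as -2,-2 as "no
jobserver" is my reading of the workaround quoted above. The name
find_jobserver is hypothetical, and splitting on whitespace ignores any
quoting a real MAKEFLAGS value might need.

```python
import os

PREFIX = "--jobserver-auth="


def find_jobserver(makeflags):
    """Return ("fifo", path), ("fds", (r, w)), or None.

    Hypothetical helper; the accepted value formats are assumptions.
    """
    auth = None
    for word in makeflags.split():
        if word.startswith(PREFIX):
            auth = word[len(PREFIX):]  # if repeated, the last one wins
    if auth is None:
        return None

    if auth.startswith("fifo:"):
        # Named pipe: only the name is passed down, nothing is inherited.
        return ("fifo", auth[len("fifo:"):])

    # Otherwise expect a pair of inherited file descriptor numbers.
    r_text, _, w_text = auth.partition(",")
    try:
        r_fd, w_fd = int(r_text), int(w_text)
    except ValueError:
        return None
    if r_fd < 0 or w_fd < 0:
        return None  # e.g. -2,-2: jobserver not available to us
    return ("fds", (r_fd, w_fd))


if __name__ == "__main__":
    print(find_jobserver(os.environ.get("MAKEFLAGS", "")))
```

A process that never looks at MAKEFLAGS simply never calls anything
like this, which is the point: participation is opt-in.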
This also allows an arbitrary number of intermediate processes to come
between the parent make and the sub-make, since we don't need to force
all the processes to preserve some resource (open fds) across exec
calls to make sure everything continues to work.
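A toy illustration of why the name alone is enough, assuming Linux
(where opening a FIFO with O_RDWR does not block) and an arbitrary
one-byte token. This is only a sketch of the idea, not how make
implements it; demo_token_pool and the file name are made up.

```python
import os
import tempfile


def demo_token_pool(slots=3):
    """Create a FIFO token pool, borrow one token by name, return it.

    Returns (tokens_left_while_held, tokens_after_release).
    """
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "jobserver.fifo")
    os.mkfifo(path)
    parent = os.open(path, os.O_RDWR)  # "parent make" side
    try:
        os.write(parent, b"+" * slots)  # one byte per available job slot

        # "Client" side: knows only the path, inherited no descriptors.
        client = os.open(path, os.O_RDWR)
        try:
            held = os.read(client, 1)  # acquire one token

            # Peek at what is left, then put it straight back.
            os.set_blocking(parent, False)
            try:
                left = os.read(parent, 64)
            except BlockingIOError:
                left = b""  # pool was empty
            os.write(parent, left)

            os.write(client, held)  # release the token
        finally:
            os.close(client)

        after = len(os.read(parent, 64))  # drain to count the pool
        return (len(left), after)
    finally:
        os.close(parent)
        os.unlink(path)
        os.rmdir(tmpdir)


if __name__ == "__main__":
    print(demo_token_pool())
```

Any number of intermediate processes could sit between the two opens
without having to keep a descriptor alive across exec.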
Semaphores would be the same: we pass down the NAME of the semaphore,
and each sub-process that cares would use it. But as I discovered, we
can't use semaphores: I could not find a reliable way to wait on both
a semaphore and SIGCHLD in a single event handler, other than polling,
which I don't like.
The downside is we need to write code to manage the named pipe