Re: GSoC: OOD detection

make-alpha

From:	David Boyce
Subject:	Re: GSoC: OOD detection
Date:	Sun, 29 Apr 2007 19:09:17 -0400

At 02:54 PM 4/14/2007, Paul Smith wrote:

> Hi all; sorry this is a bit slow.  Soccer season is starting and I'm
> very busy!  This email will be a bit of a brain dump so please bear with
> me.

Likewise. Busy, distracted, skimming this thread where I should be studying it, etc....

> Before anything else, it's important to realize that there are two
> distinct, yet interdependent capabilities we are discussing here: the
> first is the ability to use a separate algorithm for OOD determination,
> and that's the one we've been talking about so far.  But the second one
> is at least as important and, in my opinion, the more challenging
> design-wise: that is stateful make; the ability to keep some state
> information across invocations of make.  I don't think there are too
> many OOD algorithms that you can choose that wouldn't require persistent
> state.  Deciding how to store that state, especially when you don't
> really know what format it will be in (obviously the state to be stored
> will vary with the OOD algorithm chosen), provide it to the OOD
> algorithm, etc. is a design challenge.

I should preface my remarks by acknowledging that many of you, certainly Paul, have been thinking about make much longer and harder than I have, so I may well be misunderstanding or oversimplifying some issues. But I hope you'll at least consider my argument that it doesn't need to be as complicated as this.

The way I've always imagined this is that make would deliberately *not* address the problem of persistence. Instead it would defer that to the particular OOD override, which at least has the virtue of laziness (pace Larry Wall). Let's start by considering the "competition"; ClearCase is certainly the best-known, most widely deployed tool which currently offers advanced OOD detection, and it keeps its persistent data in a network database. So let's say I or someone else wants to implement full ClearCase-like functionality using GNU make. A database may well be preferred to sentinel files in such an environment. In fact, if we want to be able to share our stateful knowledge with someone not operating in the exact same file tree, that may be necessary.

In fact let's take this to its logical conclusion: what if one could reverse-engineer the CC network protocol and wanted to tap into its database for OOD decisions? I have no plan (or hope) of doing this, but it's not inconceivable that the vendor might contribute an implementation, and in any case it serves to illustrate the point that make need not handle its own persistence.

Let's consider a possible design off the top of my head. Say we define a struct containing all potentially pertinent data for OOD decisions (using short names for now because I'm a bad typist):


struct ood {
        int ood_version;                // struct layout version, for extensibility
        char **ood_targets;             // vector of paths to targets
        char **ood_prereqs;             // vector of paths to prereqs
        char **ood_envp;                // traditional environ vector
        char *ood_cwd;                  // current working directory
        char *ood_script;               // the build script
};

That's all I can think of which would affect a go/no-go decision, but by using a struct and storing ood_version we give it a faint OO gloss which would allow for extensibility in case something else turns up. Make always calls the ood() function and passes it the above struct whenever such a determination needs to be made. The default algorithm would be to simply compare the dates of the targets to those of the prereqs, in which case state is handled for us as it always has been. If an override algorithm is detected, the override function not only has all the information it needs for the OOD decision, it has enough data to store that state too[*], which it could store in a file or by writing to a socket or whatever.
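
As a rough illustration of the default algorithm just described, here is a minimal sketch, assuming POSIX stat() and NULL-terminated path vectors. The names ood_req and ood_default are hypothetical, the record is trimmed to the fields the timestamp check needs, and treating a missing prerequisite as "out of date" is a simplification rather than what GNU make does today.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical request record, trimmed to what a timestamp check needs.
   Both vectors are assumed to be NULL-terminated. */
struct ood_req {
    int    version;
    char **targets;   /* paths to targets */
    char **prereqs;   /* paths to prereqs (may be NULL) */
};

/* Return 1 if a rebuild is needed, 0 if every target is up to date. */
int ood_default(const struct ood_req *req)
{
    struct stat st;
    time_t oldest_target = 0;
    int have_target = 0;

    assert(req != NULL && req->targets != NULL);

    /* Find the oldest target; a missing target always forces a build. */
    for (char **t = req->targets; *t != NULL; t++) {
        if (stat(*t, &st) != 0)
            return 1;
        if (!have_target || st.st_mtime < oldest_target) {
            oldest_target = st.st_mtime;
            have_target = 1;
        }
    }
    if (!have_target)
        return 0;                   /* no targets listed: nothing to rebuild */

    /* Any prereq strictly newer than the oldest target means out of date. */
    for (char **p = req->prereqs; p != NULL && *p != NULL; p++) {
        if (stat(*p, &st) != 0)
            return 1;               /* simplification: missing prereq => rebuild */
        if (st.st_mtime > oldest_target)
            return 1;
    }
    return 0;
}
```

Note that no state is stored anywhere: the file system's modification times are the state, which is why the default case needs no persistence layer.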

[*] I see the first flaw already, which is that, as Paul said, the state must be stored *after* the build script is run while the OOD decision is made before. So this basically means you'd need both an ood_pre() and an ood_post() function.

To my mind dependence on sentinel/stamp files is at least arguably a hack and I'd prefer that the design didn't require them. I also think anything which keeps the core of GNU make simpler is a good thing. Of course pushing persistence off to the user might make the implementation of these extensions a little more complex, but you could deal with that by taking the same code you'd use for storing file-based state and sticking it into a documented library instead of linking it into the make program.

Expanding on a previous point: it seems impossible to encode details such as "connect to port 9382 on machine foobar and send the names and MD5 states of the prereqs down the wire, then let me know what answer comes back" in a make variable like .OUT_OF_DATE. You'd basically be forced to write a little client program to do so and run it with $(shell), which would lead to performance issues, especially on Windows, which is not optimized for quick cheap process creation. OTOH I do see that the ability to use target-specific settings would be quite elegant.

So to sum up: my argument is that OOD is conceptually pretty simple: (1) find all the places where datestamp comparison is done now and bring them all through one API, and (2) come up with a way for that API to be interposed. Am I missing something important?
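
To make the two steps concrete, here is a minimal sketch of one possible shape, assuming a function-pointer hook. Every name in it (ood_req, ood_check, ood_set_handler, ood_always) is hypothetical, and the built-in check is only a stub standing in for the existing timestamp comparison.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request record, trimmed for this sketch. */
struct ood_req {
    char **targets;
    char **prereqs;
};

typedef int (*ood_fn)(const struct ood_req *req);

/* Stub standing in for the existing timestamp comparison;
   here it simply reports "up to date". */
int ood_by_timestamp(const struct ood_req *req)
{
    (void)req;
    return 0;
}

/* Example override: always rebuild, similar in spirit to `make -B`. */
int ood_always(const struct ood_req *req)
{
    (void)req;
    return 1;
}

static ood_fn current_ood = ood_by_timestamp;

/* Step (2): install an override. Passing NULL restores the built-in check.
   The previous handler is returned so a caller can chain or restore it. */
ood_fn ood_set_handler(ood_fn fn)
{
    ood_fn prev = current_ood;
    current_ood = (fn != NULL) ? fn : ood_by_timestamp;
    return prev;
}

/* Step (1): the single entry point every former timestamp check would call. */
int ood_check(const struct ood_req *req)
{
    assert(req != NULL);
    return current_ood(req);
}
```

How the override gets installed (a loaded shared object, a built-in table selected by a makefile variable, something else) is left open here; the sketch only shows that the call sites need not know which algorithm is active.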


David B

PS Both my model and yours would appear to suffer from an obvious race condition; what if something happens to change one or more prereqs between the "pre" moment (when OOD determination is made) and the "post" moment (when state is stored), either as the result of a badly designed build script or what ClearCase calls "interference from another process"? It seems some transitional state must be stored within the make process. Maybe building on your idea, the ood_pre() function could return a char pointer which would be null if the target is up to date and otherwise a valid string. This string is then passed into the ood_post() call for it to use as desired. The typical use would be to remember size/date/MD5 of the prereqs from the pre call and check that they're unchanged in the post.
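
A minimal sketch of that token idea, assuming POSIX stat() and recording only size and modification time (no MD5, to keep it short). The signatures are hypothetical: in particular ood_pre() takes a `stale` flag in place of the real OOD decision, and a real design would need a separate error channel, since NULL here also results from allocation failure.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Build a "size:mtime;" fingerprint for each path in a NULL-terminated
   vector. Missing files are recorded as "missing;". Caller frees the result.
   Returns NULL only on allocation failure. */
char *ood_snapshot(char **prereqs)
{
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;
    buf[0] = '\0';

    for (char **p = prereqs; p != NULL && *p != NULL; p++) {
        struct stat st;
        char item[64];
        int n;

        if (stat(*p, &st) == 0)
            n = snprintf(item, sizeof item, "%lld:%lld;",
                         (long long)st.st_size, (long long)st.st_mtime);
        else
            n = snprintf(item, sizeof item, "missing;");
        if (n < 0 || (size_t)n >= sizeof item) {
            free(buf);
            return NULL;
        }
        if (len + (size_t)n + 1 > cap) {
            cap = (len + (size_t)n + 1) * 2;
            char *tmp = realloc(buf, cap);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
        }
        memcpy(buf + len, item, (size_t)n + 1);   /* copies the NUL too */
        len += (size_t)n;
    }
    return buf;
}

/* "Pre" moment: NULL means up to date; otherwise a token for ood_post(). */
char *ood_pre(int stale, char **prereqs)
{
    return stale ? ood_snapshot(prereqs) : NULL;
}

/* "Post" moment: 0 if the prereqs still match the token, -1 if they changed
   (or could not be re-read) while the build script was running. */
int ood_post(const char *token, char **prereqs)
{
    assert(token != NULL);
    char *now = ood_snapshot(prereqs);
    int same = (now != NULL && strcmp(now, token) == 0);
    free(now);
    return same ? 0 : -1;
}
```

An override would persist its state only when ood_post() returns 0; on -1 it could refuse to record the build as good, so the target is reconsidered on the next run.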
