Re: Request for comments: CONS specification


From: H. S. Teoh
Subject: Re: Request for comments: CONS specification
Date: Thu, 27 May 2004 21:33:15 -0700
User-agent: Mutt/1.5.6i

On Fri, May 28, 2004 at 02:22:20AM +0200, Pierre THIERRY wrote:
> I added some comments made in private to my SRS, and wrote some cleaner
> webpages to collect everything:
> 
> http://arcanes.fr.eu.org/~pierre/CONS%20revival/Documentation/
[...]

Cool, here are some comments:

Under "Simplicity":  It's probably impractical to allow the user to
specify just a small number of .o files and let Cons guess the rest.
First, extracting class/function information requires a full parser for
the language (or very close to a full parser), which might be overkill
since this should be the job of the compiler. Second, Cons would need to
build the .o files before it can know that a particular function is
needed, so you have a dependency tree that changes during the build
process. Third, if the user has old source files or sources for alternate
versions of a module, Cons might link in something it's not supposed to:
e.g.,
        module1.c defines func1()
        module2.c defines func1()
        main.c calls func1()

What the user wants is to link either module1.o or module2.o, but not
both. (Perhaps module2.o is old, unused code. Or maybe it's new, untested
code that isn't ready to be compiled.)  But unless this is specified, Cons
has no way of knowing which module to build and link.
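
To make the ambiguity concrete, here is a minimal Python sketch of the
check a tool would need before it could guess object files from symbol
usage. The function name and the symbol table are hypothetical, and a
real implementation would need a parser (or compiler output) to fill
the table in:

```python
from collections import defaultdict

def find_ambiguous_symbols(definitions, needed):
    """Return {symbol: [modules]} for each needed symbol with several providers."""
    providers = defaultdict(list)
    for module, symbols in definitions.items():
        for symbol in symbols:
            providers[symbol].append(module)
    return {s: sorted(providers[s]) for s in needed if len(providers[s]) > 1}

# Hypothetical symbol table matching the example above
definitions = {
    "module1.c": {"func1"},
    "module2.c": {"func1"},
    "main.c": {"main"},
}
ambiguous = find_ambiguous_symbols(definitions, needed={"func1"})
print(ambiguous)
```

Any symbol that shows up in the result is one where the user has to
say which module is meant.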

Having said that, though, we *should* make Cons require only the minimum
information it needs to build something. E.g., if specifying a list of
object files is enough, then the user should not have to tell Cons
explicitly to run the compiler on each source file. If there is an
ambiguity, the user should be able to give just enough info to resolve
it, and Cons can carry out the rest of the dependency calculation; the
user should not have to spell out everything just because one thing in
the middle of the dependency chain needs to be stated explicitly.

One idea that occurs to me is that the user can state dependencies without
stating the file types, and Cons should be able to figure out what to do.
For example, you can say "a.xml depends on b.xml and c.xslt", and Cons
should automatically deduce that it should run:
        Xalan -in b.xml -xsl c.xslt -out a.xml
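
One way this deduction could work is a rule table keyed on file
extensions. The sketch below is only an illustration in Python: the
table layout and the infer_command() helper are made up for this
example, and it assumes at most one source file per extension.

```python
import os

# Hypothetical rule table:
# (target extension, set of source extensions) -> command template
RULES = {
    (".xml", frozenset({".xml", ".xslt"})):
        "Xalan -in {xml} -xsl {xslt} -out {target}",
}

def infer_command(target, sources):
    """Pick a command from the extensions alone, or None if no rule fits."""
    by_ext = {os.path.splitext(s)[1]: s for s in sources}
    key = (os.path.splitext(target)[1], frozenset(by_ext))
    template = RULES.get(key)
    if template is None:
        return None
    # Template fields are named after the extensions, without the dot
    fields = {ext.lstrip("."): path for ext, path in by_ext.items()}
    return template.format(target=target, **fields)

command = infer_command("a.xml", ["b.xml", "c.xslt"])
print(command)
```

If no rule matches the combination of extensions, the helper returns
None, which is the point where the user would have to supply more
information.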



Another thought I just had, which is related to both the LaTeX iteration
problem and the dependency problem: after thinking about it more, I'm
now more inclined to have Cons build its dependency graph *during* the
build process, rather than computing all of it beforehand. Here's the
reason: the LaTeX iteration problem is *exactly* the same as the Flex
.l dependency problem. The reason document.tex can produce a different
document.ps when you run LaTeX a second time is that document.ps
actually depends not just on document.tex, but on document.aux and
possibly document.toc and other files as well. I.e., the full set of
dependencies for document.ps is:

              document.ps
                    |
              document.dvi
             /      |     \
 document.tex document.toc document.aux ...

Of course, in a clean source tree, document.toc and document.aux don't
exist; they are generated from document.tex. The full definition of the
LaTeX file type would be:

        Inputs: document.tex
        Outputs: document.dvi, document.aux, document.toc, ...

So when Cons wants to build document.ps, it needs to build document.toc
and document.aux as well. It does this by running LaTeX on document.tex to
produce document.dvi, document.aux, ... etc., and then converts
document.dvi to document.ps.

Now the catch here is this: since document.aux is one of the dependencies
of document.dvi, Cons *should* re-build document.dvi if document.aux
changes. Sure enough, after you run LaTeX the first time, document.aux is
updated. If the checksum for document.aux is different from the cached
checksum, we know that we need to re-build document.dvi. This can only be
determined if Cons scans document.aux *after* it runs LaTeX the first time.

If Cons does this scanning, though, it will *automatically* always produce
an up-to-date .ps file, because each time it runs LaTeX, it detects a
change in one of the auxiliary files, so it will re-build the target
until the dependencies have stopped changing. This will also solve the .l
dependency problem: after running Flex, Cons scans the output .c file and
finds out that it depends on scanner.h, and so it can produce the correct
list of dependencies. This way, we will also automatically ensure that
scanner.c is built correctly.

So I guess it *is* true that a file is up-to-date when its source files
have not been modified---we just have to be careful to include *all* its
source files, not just what we normally think of as source files (e.g. 
LaTeX's .aux files technically are a part of the input that produces the
output document and must be included in the list of sources. If we only
count the .tex file as a source, it's no wonder we don't get an up-to-date
output file.) 
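
The rebuild-until-stable behaviour described above can be sketched in a
few lines of Python. Everything here is a stand-in: fake_latex() only
mimics the way LaTeX reads the previous run's .aux file, files are held
in a dict rather than on disk, and SHA-256 is an arbitrary choice of
checksum.

```python
import hashlib

def checksum(data):
    return hashlib.sha256(data.encode()).hexdigest()

def fake_latex(files):
    """Stand-in for LaTeX: reads .tex and the old .aux, writes .dvi and .aux."""
    tex = files["document.tex"]
    old_aux = files.get("document.aux", "")
    labels = sorted(w for w in tex.split() if w.startswith("sec"))
    files["document.aux"] = "labels:" + ",".join(labels)
    # The .dvi is typeset against the .aux left by the *previous* run
    files["document.dvi"] = "dvi(" + tex + "|" + old_aux + ")"

def build_until_stable(files, run_tool, sources, max_runs=10):
    """Re-run the tool until the checksums of all its inputs stop changing."""
    cached = {}
    runs = 0
    while runs < max_runs:
        current = {s: checksum(files[s]) for s in sources if s in files}
        if current == cached:
            break              # inputs are what the last run saw: up to date
        cached = current
        run_tool(files)
        runs += 1
    return runs

files = {"document.tex": "sec1 intro sec2 body"}
runs = build_until_stable(files, fake_latex, ["document.tex", "document.aux"])
print(runs, files["document.dvi"])
```

The max_runs cap is an added safety net for inputs that never settle;
the original argument does not mention one.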

As far as implementation is concerned, this does make things a bit
complicated, since Cons won't know the full dependency tree before it
starts the build. It will, however, know the *skeletal* tree, which I
think should be enough to proceed. Doing parallel builds on a partial
dependency tree shouldn't be too bad: hopefully such cases are rare, so
the partial tree is "almost" the full tree, and splitting it up for a
parallel build should be a "good enough" approximation even if some of
the steps need to change when Cons discovers new dependencies during the
build.
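
As a rough illustration of starting from a skeletal tree and extending
it while building, here is a sequential Python sketch. The graph
layout, the scan() callback and the file names are assumptions made for
the example; it ignores parallelism and assumes the graph has no
cycles.

```python
def build(target, graph, scan, order):
    """Build target depth-first, scanning each dependency once it exists."""
    if target in order:
        return
    deps = graph.setdefault(target, [])
    pending = list(deps)
    while pending:
        dep = pending.pop(0)
        build(dep, graph, scan, order)
        # Only now that dep has been built can it be scanned for more deps
        for extra in scan(dep):
            if extra not in deps:
                deps.append(extra)      # the tree grows during the build
                pending.append(extra)
    order.append(target)                # stand-in for running the real command

# Skeletal tree known before the build starts
graph = {"scanner.o": ["scanner.c"], "scanner.c": ["scanner.l"]}
# Hypothetical scanner: the generated scanner.c turns out to need scanner.h
scan = lambda f: {"scanner.c": ["scanner.h"]}.get(f, [])

order = []
build("scanner.o", graph, scan, order)
print(order)
print(graph["scanner.o"])
```

The dependency on scanner.h is not in the initial graph; it is only
added to scanner.o's list after scanner.c has been generated and
scanned.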


T

-- 
We've all heard that a million monkeys banging on a million typewriters will
eventually reproduce the entire works of Shakespeare. Now, thanks to the
Internet, we know this is not true. -- Robert Wilensky
