From: Thomas Lord
Subject: Re: [Gnu-arch-users] programming in the large (Re: On configs and huge source trees)
Date: Tue, 18 Oct 2005 18:29:05 -0700

Alfred: you impressed me by digging fairly deep (e.g., coming up with
the unexec bogosity in package-framework) so I have a fairly long and 
detailed reply for you.

And yes, dear trolls, this does remain Arch relevant.

 Tom> Alfred argues that Autoconf plugs into the role of a configure
 Tom> for "programming in the large" because it handles sub-projects.

 Tom> Eh.

 Alfred> I guess you haven't ever used it for something big (tla is
 Alfred> quite small in my opinion)

My first full-time job, back before I was fully into free software,
was writing a configure/build system and applying it to a reasonably
large system (the Andrew toolkit and the suite of applications built
on that).

  Tom> Autoconf has climbed too far up the dependency stack (meaning it
  Tom> relies on too much other software).

  Alfred> Tom, you're smart, and I like you.  

Thanks.  You seem smart, too.

  Alfred> But stop being silly, autoconf relies on less software than
  Alfred> your package-framework (it relies on awk and printf in
  Alfred> addition to the tools that autoconf relies on).

There are two phases in autoconf (and this is part of its problem).
One phase is translation of `' and the other phase is
execution of `../configure'.

The first phase of autoconf relies on GNU m4 and perl.  Increasingly,
in practice, it relies on automake and libtool.  While jocularly
dismissed in the autoconf documentation, the circular dependency
between m4 and autoconf is a clear bootstrapping bug.

The awk dependencies in package-framework are extremely slight --
easily done away with.  I don't claim that package-framework is the
right code-base to start with, only that it's a good demonstration of
what is achievable.

One thing package-framework, in combination with a portability library
like libhackerlab, demonstrates (to, let's say, a solid
working-prototype level) is that applications don't need the 
two-phase hair of auto*.  It's simpler, at least as effective,
and certainly more easily maintained to have applications above
the bootstrapping level depend on just a minimal GNU development
environment.

The portability gymnastics of that minimal development environment
are worth it, sure -- but auto* discourages leveraging the benefits
of going through that effort.

Add to that the number of packages that make poor or outright
incorrect use of auto* -- and the size of the documentation and
obscurity of the codebase for auto* -- and you should start to 
wonder who is benefiting from its widespread use and why.

   Tom> Autoconf, at least as commonly used, is lousy at dependency
   Tom> discovery and awkward to control to override its defaults.

   Alfred> I disagree, --with-FOO=/dir/to/foo is quite flexible.  Far
   Alfred> more flexible than hard coding crti.o as you have done for
   Alfred> unexec on GNU/Linux platforms (did you know that the
   Alfred> standard location for C run time init object files is
   Alfred> actually /lib on GNU/Linux?)

Yeesh, you dig deep.  I like that.

The code to which you refer is vestigial -- left over from an
experiment I did quite a few years ago to provide an emacs-like
`unexec' for systas scheme.

Again, I don't claim that package-framework is polished and ready to
go, only that it's a good demonstration.  So far as I know, it is safe
(and appropriate) to simply delete the `src/build-tools/configs'
subdir and the small amount of script code and Makefile scraps that
depend on it.

So, sure -- that code's a bug but one that's of no consequence.

  Alfred> In fact, I think that normal users should simply use binary
  Alfred> packages.  If you are a developer and wish to hack on
  Alfred> something, it is trivial to configure a program.

Actually, I strongly agree but with a qualification.

"Normal" users should be getting binaries, yes.

Those binaries should come from a competent supplier, preferably
geographically or at least logically close, and I personally have 
in mind a ratio of engineers to users.  I don't have my figures handy
but I recall working it out to something like 30 per 10,000.

That prices out, to consumers, like a fairly inexpensive premium cable
channel per personal computer, with money left over for R&D and,
across a nation or a globe, lots of paid labor left over for free
software R&D.

And I mean that 30:10,000 ratio to be specific and real.  You've got
those 30 running a 10k-seat distribution business *and* a feedback
community.   That gives lots of redundancy and grounds R&D in what
"normal" users are thinking and doing.

  Tom> In other words, it does sorta ok at looking in "standard
  Tom> locations" to find a dependency but that facility doesn't seem
  Tom> to well-handle the case when you have sibling source components
  Tom> in a tree being installed in a non-standard place.

  Alfred> And package-framework does not fix any of that.  

Weakly agree.

The REQS and OPTS dependency sorting stuff is part of a solution.
The emphasis on code-layout within a tree as a guide to simplified
construction is part of a solution.   The unfinished
package-dependency stuff in there would round it out.
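
To make the REQS/OPTS idea concrete, here is a minimal sketch of
dependency sorting.  It is not package-framework's code: the dict
layout, the component names, and the `build_order' helper are all
assumptions made for illustration.  Hard requirements (REQS) must be
present; optional ones (OPTS) only affect ordering when they happen to
be in the tree.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical data: the real REQS/OPTS file format is not shown in this
# thread, so this layout and these component names are assumptions.
COMPONENTS = {
    "hackerlab": {"reqs": [], "opts": []},
    "libarch": {"reqs": ["hackerlab"], "opts": []},
    "tla": {"reqs": ["libarch", "hackerlab"], "opts": ["docs"]},
}


def build_order(components):
    """Return names ordered so each dependency precedes its dependents.

    A missing hard requirement raises KeyError; a missing optional
    dependency is silently ignored.
    """
    graph = {}
    for name, deps in components.items():
        for req in deps["reqs"]:
            if req not in components:
                raise KeyError(f"{name} requires missing component {req}")
        present_opts = [opt for opt in deps["opts"] if opt in components]
        graph[name] = set(deps["reqs"]) | set(present_opts)
    # static_order() raises graphlib.CycleError on circular dependencies.
    return list(TopologicalSorter(graph).static_order())


order = build_order(COMPONENTS)
print(order)
```

The point of the sketch is only that the ordering falls out of declared
data, with no probing of "standard locations" involved.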

All of the above could do with another pass now that experience
has been gained but there it is.

So, you're right, but still package-framework has good stuff to 
say about the topic.

  Alfred> The way you
  Alfred> solve it with package-framework (from the looks, I only took
  Alfred> a brief look at it right now so I might be completely off
  Alfred> base) is that you include each library that is needed.  Say
  Alfred> you have this little GNOME program that needs some parts of
  Alfred> GNOME, would you distribute the whole GNOME suite just so
  Alfred> that you compile the program?

Well, yes partly.  I'd certainly like to be able to instantiate such
an environment without standing on my head.  It should suffice, for
that purpose, for the maintainer of the little GNOME project to 
publish an Arch-type `config' file.

And, no -- I'd also want more standardized and better designed install
conventions so that I can mix a bunch of packages in one tree, give
one parameter, and have the `--with' stuff filled in for all the
sub-packages from that (so to speak -- the literal mechanism might
be different).
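
As a rough sketch of "give one parameter, have the `--with' stuff
filled in": the snippet below derives a configure command line for each
package in a tree from a single install prefix.  The layout (each
package installed under PREFIX/NAME) and the helper names are
assumptions for illustration only; the `--with-FOO=/dir/to/foo' form is
the one Alfred mentions above.

```python
import posixpath


def with_flags(prefix, packages):
    """Build one --with-NAME=DIR argument per package.

    Assumes (hypothetically) that each package installs under PREFIX/NAME.
    """
    return [f"--with-{name}={posixpath.join(prefix, name)}" for name in packages]


def configure_commands(prefix, packages):
    """Map each package to a configure command pointing at its siblings."""
    commands = {}
    for pkg in packages:
        siblings = [other for other in packages if other != pkg]
        commands[pkg] = (
            ["./configure", f"--prefix={posixpath.join(prefix, pkg)}"]
            + with_flags(prefix, siblings)
        )
    return commands


if __name__ == "__main__":
    # Package names here are placeholders, not a real project layout.
    for pkg, argv in configure_commands("/opt/tree", ["foo", "bar"]).items():
        print(pkg, " ".join(argv))
```

The literal mechanism could just as well be an environment variable or
a shared config file; the sketch only shows that one parameter is enough
once the install conventions are fixed.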

  Alfred> Then there is the major deficiency of tla using static
  Alfred> libraries for hackerlab.  Assume that you have a dozen
  Alfred> programs using hackerlab, and you find some security issue
  Alfred> or what not in some function, you will end up recompiling
  Alfred> everything.  Simply out of the question when you have a few
  Alfred> hundred programs.

I think dynamic libraries are overrated and widely abused but, yes,
they are also sometimes very valuable.

The package-framework demonstration *would* have support for them
if the libtool authors had bothered to float their collected knowledge
of how they work on various platforms in some form other than
their source code.   There's a project of a few man-months there to
tease that information out into a more useful form (ideally making 
libtool itself more data-driven from that database of wisdom).

  Tom> Autoconf has also become notoriously bloated, etc.  It's never
  Tom> quite stabilized, even after all these years, which should at
  Tom> least make one suspicious.

  Alfred> Once again, I ask you to stop being silly, autoconf has been
  Alfred> stable since 2.50 when it got a huge overhaul.  GCC is in
  Alfred> more flux.  It also has less bloat than tla, which
  Alfred> implements its own C library just cause you happen to
  Alfred> dislike libc for whatever silly reasons, while still needing
  Alfred> to link against libc!

Between 2001 and 2004 I made various attempts to download packages in
source form to a FreeBSD system and build them.  When packages had
lots of prereqs, config/build/install bugs in auto*-using packages
were most often the show-stoppers.

  Tom> One thing I wanted to show with package-framework and hackerlib
  Tom> is that you can standardize a package-combining system and use
  Tom> portability libraries and then you don't need autoconf's hair.

  Alfred> Once again, you do not standardise something by inventing
  Alfred> something new.  I also fail to see what the exact hair in
  Alfred> autoconf is, and I'm far too familiar with autoconf.

The two phase thing and its consequences.   Lots of packages wind up
with "3rd party macros" that may work on Linux but sure didn't on

I admit that my experience is anecdotal and my conviction is based on
a priori consideration of the approach and actual code of auto*.  It
could be refuted or made more rigorous by looking more carefully at
what labor goes into, for example, FreeBSD ports or Debian packages.

  Tom> Alfred cites unoptimized strcmp as source of tla performance
  Tom> issues.

  Alfred> No I didn't, I said that it _might_be_ a source for some of
  Alfred> tla's performance issues.  If I had cited it as a source I'd
  Alfred> provide hard numbers.

Sorry to have mischaracterized you.

My understanding of the numbers is seat-of-your-pants engineering
rather than a patient careful study.

It's definitely true that naked benchmarks of the `str_' functions
can't compete with good native `libc' semi-replacements.  It's
consistently looked to me like this was never a big deal in tla
performance and the trade-offs (e.g., code so simple it serves as
documentation) have, so far, been more worth it than not.

  Tom> by letting go of leadership on, for example GCC.

  Alfred> The FSF never let go of the leadership on GCC, they still
  Alfred> are, and always have been, the leadership.  They just did
  Alfred> some changes in how it was exactly managed (i.e. one person
  Alfred> maintaining the whole thing and getting loads and loads of
  Alfred> bad patches, sound familiar?).

The FSF doesn't have any serious leadership of GCC.  It has some
loyalty over narrow issues.   E.g.: GCC development involves paper
assignment forms.  E.g.: GCC developers are not eager to see the 
compilation phases split in certain ways (librifying parts of GCC)
since that would undermine the GPL (as opposed to LGPL) licensing
of it.

But, quick thought experiment:  suppose RMS decided tomorrow that 
it was desirable to float a simplified bootstrapping compiler or 
conduct a particular major code cleanup in a systematic way.  Suppose
we wanted 10% of the effort to go in that direction.  What do you 
think would happen?

  Tom> I rely on plenty of tools that already exist and replace a
  Tom> relatively small subset with tools that have some advantages.

  Alfred> Instead of replacing (you're quite fond of that it seems)
  Alfred> why not fix them and add more advantages instead of doing
  Alfred> complete rewrites?  It will save both you (no need to
  Alfred> rewrite the whole thing), and others (no need to try and
  Alfred> understand how your rewrite differs) time.

Details matter.  Note that hackerlib, string functions
notwithstanding, doesn't actually (despite all claims to the
contrary) replace much of libc at all.  The `vu_' subsystem,
with the sole exception of `printfmt', doesn't reimplement 
squat of libc (and, indeed, relies on libc).  Similarly for
many other subsystems.  The `rx' subsystem does reimplement
some standard functionality but with radically different (and,
I think, better) performance characteristics and an expanded
API.   Out of the bulk of `hackerlib' there are a few 10 string
functions that people complain about and out of *that* alone
people construct arguments such as yours.

  Alfred> My major grief with hackerlab is the rewrite of standard C
  Alfred> functions, strcmp, printf, ...  There are in fact many nice
  Alfred> things in hackerlab, but the majority is just a silly
  Alfred> rewrite of the C library for no apparent reason.  Seriously,
  Alfred> I really cannot understand how you can justify a rewrite of
  Alfred> something as silly as strlen!  It just makes it a hell for
  Alfred> anyone who knows C to figure out how exactly each new
  Alfred> little function behaves.

Hardly the majority.

I took my necessity for replacing `printf' from two things: (1) not
wanting to have to depend on (the very fine but excessive for 
this purpose) GMP;  (2) certainly not wanting stdio-style buffering.
I totally swiped my better approach to buffering from Andrew Hume
and improved his approach by combining it with `vu's system-call

The `str_' functions have a more regular interface than libc.
There would certainly be no harm in porting, linking with, or
otherwise inheriting the work (where it overlaps) on
platform-optimized libc work-similars but it has never, in fact,
been worth the time.

