Re: [Gnu-arch-users] [OT] GCC (was Re: Re: Command abbreviations)


From: Tom Lord
Subject: Re: [Gnu-arch-users] [OT] GCC (was Re: Re: Command abbreviations)
Date: Fri, 19 Mar 2004 10:14:29 -0800 (PST)


    > From: "Pierce T.Wetter III" <address@hidden>

    > >>> Anyway, GCC isn't the only game in town and, looking at my crystal
    > >>> ball, it's going to be blown away w/in the next 10 years by something
    > >>> about 10x smaller (measured by source-code).  Lcc is most definitely
    > >>> not that GCC killer -- but it proves the point.

    > >>    I don't know why that would be true, since it would reverse 
    > >> the trend of all the other software projects in existence.

    > > Out of curiosity, please elaborate.

    > Once upon a time, programming was a lot harder than it is now. [....]

    > These days, there's so much memory, so much processor speed, and so 
    > much disk space that everything just keeps bloating, because it doesn't 
    > have to be concise and elegant. Good programmers spend their time 
    > solving more different problems, and ignoring speed or size. But it's 
    > the bad programmers who then end up implementing all the other stuff, 
    > and it's crap, and there's more of them.

As a rule of thumb, that's true.  Partly you can think of it as an
aspect of the commoditization of programmers, just as in other skilled
trades.   You can do a lot by designing a software system in such a
way that it can be extended over and over by "line coders" who
specialize in focusing on just one little set of
objects/routines/whatever.   It can be easy (though it is not the
necessary outcome) to lose control over the overall architecture and
wind up with a big bloated mess.

It may generate bad code.   It may create more long-term problems
than it solves short-term ones.  But the key thing is that it is
a reliably reproducible process, so you can plan a business model
around it and then execute that plan.  So, people spend a lot of
money to go this route, repeatedly, and there's a lot of it around.

The "geniuous stuff" -- say, for example, the ultimate airline
reservations system -- is rarer.   And for that kind of stuff, people
do indeed bring out the genious programmers to come in and do
something slick and clever.

It seems like a spectrum, to me, and projects slide back and forth
along it over their lifetime.   One common pattern, and GCC is an
example, is a project starting out in Genius Mode, where what the
genius leaves behind is a framework that "lesser" hackers can then
extend a gazillion ways.    Not every aspect of GCC is like that --
but the first few years of its commercial life did involve a lot of
(highly skilled but still basically rote) porting to new
architectures.

These days, the commercial focus of GCC looks to me to have shifted a
bit back towards emphasizing Genius Mode.  Ports still happen -- but
there's a lot more activity on fancy new optimizations and other kinds
of deep changes.  This shift also looks to me to be generating renewed
attention to the overall architectural issues of GCC that make such
big changes easier or harder to write (e.g., the introduction of
garbage collection a few years back).


    > I'm not sure that any of the applications I use today work as well 
    > as, say, MacWrite 1.0 in its 64K.

    >   That's what provoked me to respond, really. I miss Lotus Jazz every 
    > time I use Excel; it was a better program...

Well, yeah.   Whole other rant there.   "Productivity software" and
"GUI apps" are areas where the commoditization process has gone badly
wrong and, while I won't get into it in this message, that's another
area where there's potential for a revolution.

But, back to GCC:

    >> My claim is grounded, to whatever degree it's grounded at all, in
    >> technology considerations: that the essential information content of
    >> GCC source code can be far more concisely expressed;

    > We can both be right on this one. There's a big chunk of GCC that's 
    > actually produced by bison and flex. Adding a regular expression engine 
    > library to my code made a lot of the other code much simpler. So I can 
    > definitely see your point, that there can be "levels" of source code 
    > that could produce GCC from a much higher-level description -- 
    > especially since compiling is an extremely well-specified problem.

The interesting bits of compilers are a big exercise in symbolic math
with heavy emphasis on set and graph domains.  A non- or
barely-optimizing compiler can be understood as nothing more than
producing a parse tree, maybe doing some rewrites and annotation of
that, and then pattern matching to produce code.  Optimizing compilers
toss in some additional graphs, find fixpoints/solve for constraints,
move things around in some mathematically straightforward way.....
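
To make that concrete, here is a minimal sketch (mine, illustrative
only -- not code from GCC) of one of those fixpoint computations:
textbook liveness analysis over a small control-flow graph, written
directly over sets in an ML-family language.  The field names (use,
def, succ) are just assumptions for the example:

    (* Illustrative sketch: liveness as a fixpoint over sets.
       live_in(b)  = use(b) union (live_out(b) minus def(b))
       live_out(b) = union of live_in(s) over the successors s of b
       Iterate until nothing changes. *)
    module S = Set.Make (String)

    type block = {
      id   : int;
      use  : S.t;        (* variables read before being written here *)
      def  : S.t;        (* variables written here *)
      succ : int list;   (* successor block ids *)
    }

    let liveness (blocks : block list) : (int * S.t) list =
      let live_in = Hashtbl.create 16 in
      List.iter (fun b -> Hashtbl.replace live_in b.id S.empty) blocks;
      let changed = ref true in
      while !changed do
        changed := false;
        List.iter
          (fun b ->
             let live_out =
               List.fold_left
                 (fun acc s -> S.union acc (Hashtbl.find live_in s))
                 S.empty b.succ
             in
             let this_in = S.union b.use (S.diff live_out b.def) in
             if not (S.equal this_in (Hashtbl.find live_in b.id)) then begin
               Hashtbl.replace live_in b.id this_in;
               changed := true
             end)
          blocks
      done;
      List.map (fun b -> (b.id, Hashtbl.find live_in b.id)) blocks

The heart of it is two lines of set algebra; everything else is
bookkeeping about representations and iteration order, which is
exactly the part a better tool could take over.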

If you didn't care about how fast your compiler would run, you could
write it in a symbolic math language.  Large swaths of the compiler
would look a lot closer to the abstract expression of an algorithm
you find in a compiler paper.  You could, having relaxed the
performance constraint, do Genius Mode compiler work much, much more
quickly, cheaply, and accurately.

But, compiler performance matters.   Consequently, the way it's shaped
up, GCC has evolved into a run-time environment which is a great
illustration of Greenspun's Tenth Rule of Programming.  The experts are
spending a lot of time "hand translating" the math abstractions into
GCC-situated implementations.

My hypothesis about the future of compilers is that the problem of
compiling a high-level description of a compiler into a fast compiler 
is a tractable problem and one that's economically attractive to
solve.  Today's "compiler compilers" are pretty weak.   I think they
can get much, much better.

GCC today is in a state where one of its competitors is ICC.  ICC
often does just a bit better, and sometimes quite a bit better -- but
at every step there's always something in the pipeline for GCC that
will close the gap.   It just takes months and months to get there.

A really good compiler-compiler would change that dynamic.
Optimizations could be coded and tested within days and if it takes a
few more days to compile them into a fast compiler, so what?


    > So that's how you're right. In fact, I would go on to say that the 
    > great strength of Perl & Python, and their success as languages, is 
    > that you can code at a much higher level because arrays and maps are 
    > so well integrated into the language. 

Right.  And, if you can imagine that a bit fancier and specialized for
compiler algorithms, you'll see what I'm saying.   A compiler compiler
could raise the abstraction beyond scripting-language arrays and maps 
to math-style sets, sequences, and functions -- taking over more of
the problem of choosing representations.   It could churn for hours or
days working out function compositions to optimize away
multiple passes over various data structures.   It could know how to
implement "fixpoint" instead of "while", and things of that sort.


    > How I'm right is that if I count the source code to Python or Perl as 
    > part of "our" source code, it will take a long while before I've 
    > written enough negative code* to make up for the size of the Python or 
    > Perl executables. When someone writes hlcc (high level compiler 
    > compiler) to replace most of the guts of gcc, I would suspect that much 
    > of the current gcc code would end up moving to that. So now to work on 
    > gcc, you have to work on TWO programs: changes to hlcc, and changes to 
    > gcc...


I don't think it will go that way.   I would expect something like:

~ a big toolbox of data structures in some ML-family language
~ a high-level language based on sets and functions
~ an a.i. program that compiles the high-level language into MLish,
  with nearly no constraints on the compile-speed of this process
~ some static analysis, simulation, and lower-performance interpretation
  for the high-level language to support the edit/compile/debug cycle

and then, yeah, a compiler that benchmarks better than GCC and ICC
written in, like, 100 pages of code, 10 additional pages per port.




