Re: [Gcl-devel] Regarding the release...

From: Vadim V. Zhytnikov
Subject: Re: [Gcl-devel] Regarding the release...
Date: Sat, 01 Mar 2003 15:18:51 +0300
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; ru-RU; rv:1.3b) Gecko/20030206

Camm Maguire wrote:
Greetings!  Well, I'm sure we may have a few more minor issues, but
current CVS works as follows for me:

1) Successfully compiles in 4 basic flavors, ansi/trad, bfd/dlopen
2) Compiles itself
3) Runs Paul's suite in the ansi case identically in both cases
4) Compiles maxima with successful make check in all 4 flavors
5) Builds acl2 successfully in trad flavor (needed by acl2 design)
6) Debian package builds and ships both images, and uses environment
        variable GCL_ANSI to toggle behavior
7) Build of both will work in principle on all 11 Debian arches
        (waiting for confirmation from autobuilders, 64 bit ia64 and
        alpha tested positively by hand)
8) The Debian package will now run Paul's suite on all platforms, so
        we can make comparisons.  All look identical thus far.


Some issues that I've discovered:

1) Earlier reported difference in maxima compilation was due to
   maxima's eliminating their sys-proclaims if :ansi-cl is defined.
   Eliminating that 'elimination' gives the same (basically, see below)
   C files as the traditional case.

2) I thought that this would be the performance culprit for ansi
   maxima, but it does instead appear to be due to the size of the
   ansi image.  maxima make check runs about 20% slower with ansi.

I'm sure that the performance difference between classic and ansi
gcl is solely due to GC.  With ansi we have a lot of extra
lisp objects in RAM, so there is less free space, and both the
number of GCs and the total GC time go up.

Take a look at some funny numbers below.  This is the time and RAM
required to compute ratsimp((x+y+z)^300)$ on a Linux AthlonXP 2400+.
For GCL the run time is given in the form T - G = N, where T is the
total time as shown by showtime:true; G is the total GC time and N
is the run time without GC.

Lisp            Time            RAM      RAM    RAM
                [sec]          before    max   after
             T  -  G  = N       [Mb]    [Mb]    [Mb]

CLISP       4.6                 5.5      29      16

CMUCL       1.6                 6.5      31      31

GCL class   5.9 - 5.2 = 0.7      8       24      24
GCL ansi    9.5 - 8.9 = 0.6     9.5      29      29

GCL class   1.0 - 0.4 = 0.6     24       31      31
GCL ansi    1.1 - 0.6 = 0.5     25       32      32

GCL class   0.7 - 0.1 = 0.6     48       55      55
GCL ansi    0.5 - 0.0 = 0.5     49       56      56


Results for GCL are presented in three series.
The first uses the default memory allocation.  In this situation
GCL seems to be very slow.  In fact it spends roughly
90% of its run time in GC.
In the second series I allocated some extra memory for cons
cells before starting the test.  Performance improved
dramatically, and now only about 40-55% of the time is GC.
In the third series still more RAM was preallocated.  With that
much RAM, GC almost disappears and GCL is roughly 2 to 3 times
faster than CMUCL (1.6 sec against 0.5-0.7 sec).
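
The percentages quoted for the three series can be rechecked from the
table with a few lines of Python.  This is only a sketch of the
arithmetic: the timings are copied from the table, while the series
labels and function names are made up for this illustration.

```python
# Timings copied from the table: (label, total time T, GC time G), in seconds.
# The labels are illustrative names for the three series, not GCL terminology.
RUNS = [
    ("default, classic", 5.9, 5.2),
    ("default, ansi", 9.5, 8.9),
    ("extra cons, classic", 1.0, 0.4),
    ("extra cons, ansi", 1.1, 0.6),
    ("large prealloc, classic", 0.7, 0.1),
    ("large prealloc, ansi", 0.5, 0.0),
]
CMUCL_TOTAL = 1.6  # CMUCL total time from the same table, in seconds


def summarize(rows):
    """Return (label, N = T - G, GC share in percent, CMUCL time / GCL time)."""
    result = []
    for label, total, gc in rows:
        net = round(total - gc, 1)               # run time without GC
        gc_percent = round(100 * gc / total)     # share of run time spent in GC
        vs_cmucl = round(CMUCL_TOTAL / total, 1) # above 1 means GCL is faster
        result.append((label, net, gc_percent, vs_cmucl))
    return result


for label, net, gc_percent, vs_cmucl in summarize(RUNS):
    print(f"{label:24s} N={net:.1f}s  GC={gc_percent:3d}%  CMUCL/GCL={vs_cmucl:.1f}x")
```

The last column is the CMUCL total divided by the GCL total, so it
compares whole run times including GC.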

It is clear that ansi GCL isn't slower than classical
GCL (it may even be a bit faster).

The apparent GCL slowness is due to the way GCL
allocates new RAM.  If memory is low, GCL spends an
enormous amount of time doing GC, and only after
this small amount of RAM is completely exhausted does it
allocate a new chunk of free space.  This strategy
is good if the total available RAM is low, and it probably
was quite optimal for old computer systems.
I'm trying to figure out how to change the memory
allocation strategy to make it more suitable
for modern computers.  The ultimate goal is
to outperform CMUCL (or at least get close to it).
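
To make the trade-off concrete, here is a toy simulation in Python.
It is not GCL's actual allocator: the model, the function name and all
parameters are assumptions made up for illustration.  Each step
allocates a fixed number of cells, a few of which stay live; a
collection costs work proportional to the live cells.  The "lazy"
policy grows the heap only when a request still does not fit after GC,
the "eager" policy also grows whenever less than half the heap is free
after GC.

```python
def simulate(steps, alloc_per_step, live_per_step,
             initial_cells, growth_cells, min_free_fraction):
    """Toy heap model (hypothetical, not GCL's real algorithm).

    Returns (gc_runs, gc_work, final_capacity), where gc_work counts
    the live cells traced over all collections as a stand-in for GC time.
    """
    capacity = initial_cells
    live = 0       # cells that survive every collection
    garbage = 0    # dead cells not yet reclaimed
    gc_runs = 0
    gc_work = 0
    for _ in range(steps):
        if live + garbage + alloc_per_step > capacity:
            # Heap is full: collect, reclaiming all garbage.
            gc_runs += 1
            gc_work += live
            garbage = 0
            # Grow until the request fits and the free-space target is met.
            while (capacity - live < alloc_per_step
                   or capacity - live < min_free_fraction * capacity):
                capacity += growth_cells
        live += live_per_step
        garbage += alloc_per_step - live_per_step
    return gc_runs, gc_work, capacity


common = dict(steps=1000, alloc_per_step=100, live_per_step=10,
              initial_cells=1000, growth_cells=1000)
lazy = simulate(min_free_fraction=0.0, **common)   # grow only when exhausted
eager = simulate(min_free_fraction=0.5, **common)  # keep half the heap free

print("lazy :  gc_runs=%d gc_work=%d capacity=%d" % lazy)
print("eager:  gc_runs=%d gc_work=%d capacity=%d" % eager)
```

In this model the lazy policy ends up collecting on almost every step
once the live data approaches the heap size, while the eager policy
pays for far fewer collections with a larger heap.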

3) pcl and clcs files still need to be rebuilt from lisp, as there are
   arch specific constants written into the C files, (I only noticed
   STREF offsets).  This observation also led to some 64bit structure
   debugging.  All looks good now on 64bit, both traditional and ansi.

4) There is a compiler bug in the volatile variable detection,
   necessitating a makefile applied patch of pcl_methods.c for now.
   If anyone wants to look into this, grep on setjmp and volatile in
   cmpnew/*lsp.  The idea is that variables cannot be put into
   registers if they are used in a block which could be accessed via a
   longjmp, i.e. throw/catch.  The code that does this in the gcl C files
   manipulates the frs stack and is thus labelled.

5) There is an (apparently small) compilation output difference
   between the ansi and traditional images.  The ANSI writes certain
   closures with the 'turbo closure' mechanism, which looks to be an
   improvement.  Until this is adequately tested, the makefiles use
   the traditional image to rebuild the lsp and cmpnew core C files.

I'm going to tag this any minute now as 2.5.1.  Tomorrow is the last
day I'll be able to do anything official.  But it looks quite good
now.  Congratulations to all!

Sincere thanks to all who contributed to GCL development!
And especially to Camm.  You are great!

A year and a half ago I was in a bad mood since it was
apparent that GCL was going to perish...
I'm very happy that I was wrong!

P.S. If anyone would like to write a short blurb about the release,
I'd be most grateful.

Take care,

     Vadim V. Zhytnikov

