guile-gtk-general

Re: Profiling guile-gtk


From: Andy Wingo
Subject: Re: Profiling guile-gtk
Date: Sat, 15 May 2004 14:40:00 +0100

Hey,

I'm spamming the list a bit, but I'm also making guile-gtk faster, so I
don't feel so bad ;)


Improvements
============

I implemented latent bindings for types. So for instance, <gtk-hbox>
isn't constructed as a scheme class until the first time it's referenced
in the source. Thereafter it's cached in the module's obarray. I had to
change the way generics are made, so that they only resolve their
specializer types when the methods are constructed.
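To make the idea concrete, here is a minimal sketch of latent binding in Python rather than in the actual g-wrap C code. Everything in it (LatentModule, make_class, the binder table) is a hypothetical stand-in: the dict plays the role of the module's obarray, and a binder is only run the first time its name is looked up.

```python
class LatentModule(dict):
    """Toy stand-in for a module obarray with latent (lazy) bindings."""

    def __init__(self, binders):
        super().__init__()
        self._binders = binders   # name -> zero-argument constructor
        self.constructed = []     # records which bindings were forced

    def __missing__(self, name):
        # dict calls this only when `name` is not cached yet.
        binder = self._binders[name]   # KeyError if the name is unknown
        value = binder()
        self[name] = value             # cache; later lookups skip the binder
        self.constructed.append(name)
        return value


def make_class(name):
    # Hypothetical binder: build a class object on demand.
    return lambda: type(name, (object,), {})


gtk = LatentModule({
    "<gtk-hbox>": make_class("GtkHBox"),
    "<gtk-vbox>": make_class("GtkVBox"),
})

hbox = gtk["<gtk-hbox>"]   # first reference constructs the class
same = gtk["<gtk-hbox>"]   # second reference hits the cache
```

After this runs, only <gtk-hbox> has been constructed; <gtk-vbox> stays latent until something references it.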

(gnome gtk) load time is down to 2.05 seconds, on my crappy box. Sweet.
Only one second to go before we reach python. The changes are in my
branch of g-wrap and guile-gnome[0].


Profiling by module
===================

Before resting content with the current status, I'd like to record the
current hot spots in the code, using similar profiling methods as
before. I'll show the times in seconds as well as a percent of the total
time. Note that this is from one run, and the sampling rate was 100 Hz
(one sample every 10 ms), so as the times shrink the values are becoming
less significant. Also, the 2.05s is from an unprofiled run -- this run
took 2.72s due to profiling. Thus these times are inflated, especially
those with deep call stacks.

This table shows the time spent loading the g-wrapped modules:

    Module      Time (%)    Time (s)
    ---------------------------------
    gobject     -           -
    glib        0           0.01
    atk         1           0.02
    pango       2           0.06
    gdk         5           0.13
    gtk         36          0.97
    =================================
    TOTAL       44          1.19
    Other       56          1.53
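As a sanity check on the table, the percentages can be recomputed from the per-module times and the 2.72s profiled total. This is just the arithmetic; the variable names are mine.

```python
profiled_total = 2.72  # seconds, the profiled run

wrapset_times = {
    "glib": 0.01,
    "atk": 0.02,
    "pango": 0.06,
    "gdk": 0.13,
    "gtk": 0.97,
}

# Each module's share of the profiled run, rounded to whole percent.
percents = {m: round(100 * t / profiled_total) for m, t in wrapset_times.items()}

wrapset_total = round(sum(wrapset_times.values()), 2)
other_time = round(profiled_total - wrapset_total, 2)

total_percent = round(100 * wrapset_total / profiled_total)
other_percent = round(100 * other_time / profiled_total)

print(percents, wrapset_total, total_percent, other_time, other_percent)
```

The rounded values agree with the table: 1.19s (44%) in the wrapsets and 1.53s (56%) elsewhere.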

Based on comparing "time guile -c (use-modules module)" for (gnome gtk),
(gnome gobject), and (oop goops), I conclude that the 56% of other time
breaks down as follows:

    Module           Time (%)    Time (s)
    --------------------------------------
    only boot-9      10          0.215
    goops            9           0.183
    gobject          26          0.530
    gtk scm code     11          0.221 (*)

    (*) calculated by difference
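The "calculated by difference" figure can be roughly reproduced as follows. This is one plausible reconstruction, not a formula stated above: it assumes the 56% share applies to the 2.05s unprofiled total, and that the three measured times are subtracted from it.

```python
unprofiled_total = 2.05   # seconds, unprofiled (gnome gtk) load
other_share = 0.56        # fraction of time outside the g-wrapped modules

measured = {"boot-9": 0.215, "goops": 0.183, "gobject": 0.530}

# Assumption: "other" time is 56% of the unprofiled total.
other_time = other_share * unprofiled_total

# Whatever is left after the measured parts is attributed to gtk scm code.
gtk_scm = other_time - sum(measured.values())

# Shares of the unprofiled total, using the table's 0.221 for gtk scm code.
shares = {name: round(100 * t / unprofiled_total)
          for name, t in {**measured, "gtk scm code": 0.221}.items()}

print(round(gtk_scm, 3), shares)
```

Under these assumptions the remainder comes out at about 0.22s, within rounding of the 0.221 in the table, and the four shares sum to 56.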

Interesting that non-g-wrap code is using most of the time.

Profiling by procedure
======================

Of the 44% spent loading the wrapsets, 40% is within
gw_wrapset_register. The other 4% I think is moving through that huge
gtk init function -- lots of blocks on that one. Perhaps if we could
have g-wrap output a static structure describing the functions, that
could shave 3-5% off the load time. It would certainly save on the file
size.

(Parenthetical note: we should also consolidate the wrap/unwrap/destroy
functions for gobjects.)

Within the 40% in gw_wrapset_register, 8% (or 0.22s) goes to defining
gsubrs. Those are wrapper functions that are not handled by libffi, of
which I think there are only about 100-200. Essentially no time is spent
making the subrs; the time is all in defining them. Of that, some goes
to making generics when names collide, a minimal part to latent-binding
bookkeeping, a very small part to defining dynprocs, and most of it to
the definition itself.

One surprising outcome of the profiling is the amount of time spent in
scm_c_define: 20%. There are two costly parts to that:
module-make-local-var! and scm_str2symbol. I can't count the former, but
the latter takes fully 13% of the total load time.

Type definition time is down to 20%, or 0.4 seconds. Generics definition
is 10%. At least, that's how I interpret the 10% and 20% anonymous
values within libgwrap-guile-runtime -- those must be the binder procs,
and since the generics binder calls scm_c_lookup on types, I figure it's
the larger percentage. These times are affected by (1) the number of
symbols in the root modules that collide with generics, and (2) any
custom scheme code that implements methods or derives types. Otherwise
the bindings would remain latent.

A flat profile doesn't tell all that much. GC takes about 40% of the
time. g-wrap and the individual wrapsets don't even show up on it. 7% of
the time is spent in ld-linux.so.


Where to from here
==================

Improvements on our side
------------------------
We're going to have to start latent binding in scheme. For instance, all
the <gparam-foo> classes, and <gchar> and such. We need to develop a
nice syntax for that, and a clean infrastructure so we're not using
internal SCM API, whatever that is.

We could try to tune the GC so that the initial threshold for GC is
higher, so as to avoid unnecessary mark/sweeps when you're going to
have to extend the memory anyway. Dunno.
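A toy model shows why the initial threshold matters. This is not Guile's GC: it assumes nothing allocated during load ever becomes garbage, that a collection is triggered whenever the allocation count crosses the threshold, and that the heap is then extended by a hypothetical growth factor of 2.

```python
def count_collections(total_alloc, initial_threshold, growth=2.0):
    """Count collections in a toy heap where nothing ever becomes garbage."""
    threshold = initial_threshold
    collections = 0
    allocated = 0
    while allocated < total_alloc:
        allocated += 1
        if allocated > threshold:
            collections += 1      # mark/sweep runs but frees nothing
            threshold *= growth   # so the heap gets extended anyway
    return collections


low = count_collections(100_000, 1_000)      # small initial threshold
high = count_collections(100_000, 64_000)    # larger initial threshold
none = count_collections(100_000, 128_000)   # threshold above total need

print(low, high, none)
```

In this model every collection is wasted work, and raising the initial threshold removes most of them. How much that helps a real load depends on how much garbage is actually produced along the way.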

Improvements on guile's side
----------------------------
Since the symbols are interned, it seems stupid to me to have to dup the
strings. There ought to be a way to define symbols statically, like
SCM_MAKINUM. Guile devs probably won't like it, though, citing
SCM-opacity or whatever. But it could make a large difference to us. We
need to ask guile-devel about this.

(gnome gobject) is a big target, too. But I don't see it improving on
our side, because all the work it does is just defining objects and
methods. Give it a good audit, sure, but GOOPS simply needs to be
faster. That means implementing the method cache in C, and fully
implementing the MOP so generics can act more reliably.
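For readers unfamiliar with the term, a method cache remembers which method applies to a given combination of argument types, so the slow applicability search runs once per combination. The sketch below is a Python illustration only; the Generic class and its dispatch rule are my own simplification, not how GOOPS does it.

```python
class Generic:
    """Toy generic function with a per-argument-types method cache."""

    def __init__(self, name):
        self.name = name
        self.methods = []       # list of (specializer tuple, function)
        self.cache = {}         # tuple of argument classes -> function
        self.slow_lookups = 0   # how often the full search ran

    def add_method(self, specializers, fn):
        self.methods.append((tuple(specializers), fn))
        self.cache.clear()      # new methods invalidate cached results

    def _slow_lookup(self, arg_types):
        self.slow_lookups += 1
        candidates = [
            # Rank by position of each specializer in the argument's MRO:
            # a lower index means a more specific match.
            (tuple(a.__mro__.index(s) for a, s in zip(arg_types, specs)), fn)
            for specs, fn in self.methods
            if len(specs) == len(arg_types)
            and all(issubclass(a, s) for a, s in zip(arg_types, specs))
        ]
        if not candidates:
            raise TypeError("no applicable method for " + self.name)
        return min(candidates, key=lambda c: c[0])[1]

    def __call__(self, *args):
        key = tuple(type(a) for a in args)
        fn = self.cache.get(key)
        if fn is None:
            fn = self.cache[key] = self._slow_lookup(key)
        return fn(*args)


class Widget:
    pass


class Box(Widget):
    pass


show = Generic("show")
show.add_method([Widget], lambda w: "widget")
show.add_method([Box], lambda b: "box")

first = show(Box())    # runs the full search, then caches the result
second = show(Box())   # served from the cache
other = show(Widget())
```

Here the two calls on a Box trigger only one full search. The complaint above is about where that lookup lives, not about whether it exists.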

Guile takes too long to load: 0.215s, whereas python only takes
something like 0.100s on my box. Part of the problem is that it defines
too much (like the networking code), and part of it is that boot-9 is
long, and implemented mostly in scheme. module-make-local-var!, for
instance, should be implemented in C, I think. The difference in the
flat profiles of (gnome gobject) and (gnome gw gtk) shows that 0.100
seconds more time is spent in scm_ceval for (gnome gw gtk) than (gnome
gobject), where really (in an efficient setup) we shouldn't be entering
the evaluator at all.


Conclusion
==========

I plucked the low-hanging fruit, bringing load times down from 13
seconds to 2. Cutting load time by another second is going to be more
difficult, with smaller gains, but all of Guile will benefit.

-- 
Andy Wingo <address@hidden>



