texmacs-dev
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Texmacs-dev] Cache profiling of TeXmacs 1.0.3.9


From: Joris van der Hoeven
Subject: Re: [Texmacs-dev] Cache profiling of TeXmacs 1.0.3.9
Date: Fri, 21 May 2004 10:47:05 +0200 (CEST)

>   1. Guile is crap, but we already knew that.
>      But TeXmacs does not help it much either. When working with the
>      hide-show plugin, which uses big trees, I have discovered that the
>      fact that that the size of tree objects which is reported to the
>      garbage collector is wrong (only the root node, not the whole tree)
>      can cause guile to spend all its time collecting, bringing TeXmacs
>      to a halt. I believe that happens because it does run out of
>      memory, but it does not increase the heap size appropriately since
>      it believe it manages much less memory than it does actually.

This seems to be a bug in Guile. I followed the Guile instructions
for reporting the size of trees (Guile "smobs").

>   2. TeXmacs has many cache misses, but they seem to happen all over the
>      code. So, the issue are either general design problems, or are
>      faults in commonly used data structures.

I do not understand the meaning of "many" here; I am not an expert on
cache issues; maybe someone can detail. From the statistics I got
the idea that there is approximately one cache miss for 1000 accesses
in TeXmacs (and 1/100 for Guile).

>      Here, my usual suspects are needless indirections. They might make
>      not a big difference. But that's something which should be tried.

Apart from the tree/array issue you mentioned, I do not see many
other points. I rather have the impression that it is the global
design of random memory access which is responsable for most
cache misses. We would need to compactify memory in order to
reduce that.

>   3. The top faulters are counted pointer constructors and destructors
>      (not real objects ctors and dtors, just pesky counted pointers)
>      and the memory allocator.

That might be; we get back to the const T& point.

>      Regarding the allocator, there is probably a lot of good
>      bibliography on that very non-trivial issue. The TeXmacs allocator
>      is reasonable, but it seems naive. Using pools for related objects
>      might help, but that may be difficult to implement and the result
>      are uncertain.

In that case, I would rather opt for a copying compactifier, which can be
launched in spare time. But I became a bit discouraged by your experiences
with GC to dig further into this. Nevertheless, I do think that a better
memory allocation scheme is possible. But it will probably be time
consuming to implement.

>   4. Another remarkable faulter is the string comparison.
>
>      The names of all typesetter variable should be interned, and
>      string objects could be either interned or self-contained. That
>      will add a boolen to string_rep, but that's probably worth the
>      cost. To make the best use of cache loads, interned strings should
>      be stored in a separate pool, whose size can be known
>      at compilation time.

Yes, this should certainly be considered sometime. In fact,
we might use symbols instead of strings as the labels of all trees.
That would make compairison much faster, even though usual string
operations like << on labels would become slower. Moreover,
the actual strings might be stored in a separate pool which is
compactified at spare time.





reply via email to

[Prev in Thread] Current Thread [Next in Thread]