texmacs-dev
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Texmacs-dev] Performance questions, proposals, patches


From: Joris van der Hoeven
Subject: Re: [Texmacs-dev] Performance questions, proposals, patches
Date: Thu, 14 Oct 2004 13:08:26 +0200 (CEST)

Hi Josef,

On Wed, 13 Oct 2004, Josef Weidendorfer wrote:
> I wonder why with GCC >=2.96 on Linux/FreeBSD, the default compilation flags
> for texmacs include "-fno-default-inline -fno-inline"?
> At least here, if compiling with inlining, the code gains at least 25% speedup
> without any negative effects.

Certain versions of GCC 3.* are bugged and caused segmentation faults
in combination with inlining. Maybe we can put inlining in again for
the most recent version, if you did not notice any suspicious behaviour.
Any patch for configure.in is welcome.

> In class hashmap_rep, there is the following member:
>   int max;                   // mean number of entries per key
> IMHO, hash tables should avoid conflicts for fast lookup, and thus, the mean
> number of entries per key should always be <1, perhaps 0.9 with a good hash
> function. Why is "max" an integer? I would propose for max being a float, and
> the default something like 0.8.
> The hash function for hyphen_table (Resource/Languages/hyphenate.hpp) seems to
> be bad: e.g. when loading the english full user manual, you have 1,2 million
> lookups and 4,5 million compares. I suggest here a max value of 0.2.

You are definitely right; the basic data structures are quite old and
may still be optimized a bit further. I also noticed another possible
optimization for arrays (and strings): instead of allocating an array
of a size which depends on the size of the array (which is used for
the << operator), it might be better to systematically allocate an array
of the same size and only use over-allocation when the << operator
is explicitly used. This might reduce the memory requirements of TeXmacs
quite a lot.

> I found the shrink function taking much time when starting scrolling in a new
> document, and looking at it, there is an easy way for optimization. The get_1
> (x,y) can be moved out from the inner two loops, and as get_1() either gives
> back 0 or 1, you can skip these loops for 0.
> ============================================================
> --- /home/weidendo/texmacs/src/src/Resource/Bitmap_fonts/glyph_shrink.cpp
> 2004-04-03 13:15:33.000000000 +0200
> +++ src/Resource/Bitmap_fonts/glyph_shrink.cpp  2004-10-13 01:07:25.713691104
> +0200
> @@ -178,17 +178,17 @@ shrink (glyph gl, int xfactor, int yfact
>    SI  off_y = (((Y2-1)*yfactor- dy)*PIXEL - ((ty*PIXEL)>>1))/yfactor;
>
>    int i, j, x, y;
> -  int index, indey;
> +  int index, indey, entry;
>    int ww=(X2-X1)*xfactor, hh=(Y2-Y1)*yfactor;
>    STACK_NEW_ARRAY (bitmap, int, ww*hh);
>    for (i=0; i<ww*hh; i++) bitmap[i]=0;
>    for (y=0, index= ww*frac_y+ frac_x; y<gl->height; y++, index-=ww)
>      for (x=0; x<gl->width; x++)
> -      for (j=0, indey=ww*ty; j<=ty; j++, indey-=ww)
> -       for (i=0; i<=tx; i++) {
> -         int entry= index+indey+x+i;
> -         int value= gl->get_1(x,y);
> -         if (value>bitmap[entry]) bitmap[entry]= value;
> +      if (gl->get_1(x,y))
> +       for (j=0, indey=ww*ty; j<=ty; j++, indey-=ww) {
> +         entry = index+indey+x;
> +         for (i=0; i<=tx; i++, entry++)
> +           bitmap[entry] = 1;
>         }
>
>    int X, Y, sum, nr= xfactor*yfactor;
> =======================================================

Thanks for the patch; character shrinking indeed seemed to
account for some overhead.

> Similar, for EPS images, encapsulate_postscript() takes quite some time: only
> to get rid of "showpage", it creates a temporary string object for every char
> in the EPS. Better put a check for a 's' char before, and append into the
> result string in large chunks:
>
> ================================================================
> --- /home/weidendo/texmacs/src/src/Plugins/Ghostscript/ghostscript.cpp      2
> 003-10-24 12:43:48.000000000 +0200
> +++ src/Plugins/Ghostscript/ghostscript.cpp     2004-10-04 12:19:11.303137496
> +0200
> @@ -43,11 +43,18 @@ ghostscript_bugged () {
>  static string
>  encapsulate_postscript (string s) {
>    int i, n=N(s);
> +  int last_begin = 0;
>    string r;
> -  for (i=0; i<n; ) {
> -    if ((i<(n-8)) && (s(i,i+8)=="showpage")) {i+=8; continue;}
> -    r << s[i++];
> +  for (i=0; i<n; i++) {
> +    if ((s[i] != 's') ||
> +       (i>(n-8)) ||
> +       (s(i,i+8) != "showpage")) continue;
> +    // found "showpage" at i
> +    if (i>last_begin) r << s(last_begin, i);
> +    i += 8;
> +    last_begin = i;
>    }
> +  if (n>last_begin) r << s(last_begin, n);
>    return r;
>  }

You probably may use some of the testing routines in analyze.hpp
for this kind of purpose too.

> ==============================================
>
> A strange thing are segmentation faults with deep function recursions on my
> machine: With TeXmacs 1.0.4.2, I get a segfault at the end of loading the
> user manual (because of the huge number of rectangles to update on screen?),
> in typesetter_rep::requires_updates(), at stack frame around 7900 (using GDB).
> This is exactly on a megabyte-boundary in the virtual adress space of the
> stack. Perhaps that's a kernel bug (Suse 9.1, 2.6.5), but IMHO at least the
> following function is worth converting to an iterative version:
>
> =================================================
> --- /home/weidendo/texmacs/src/src/Classes/Atomic/rectangles.cpp    2003-10-2
> 4 12:43:43.000000000 +0200
> +++ src/Classes/Atomic/rectangles.cpp   2004-10-05 11:12:05.207071984 +0200
> @@ -199,9 +199,24 @@ simplify (rectangles l) {
>  rectangle
>  least_upper_bound (rectangles l) {
>    if (nil (l)) fatal_error ("no rectangles in list", "least_upper_bound");
> -   if (atom (l)) return l->item;
> -   rectangle r1= l->item;
> -   rectangle r2= least_upper_bound (l->next);
> -   return rectangle (min (r1->x1, r2->x1), min (r1->y1, r2->y1),
> -                    max (r1->x2, r2->x2), max (r1->y2, r2->y2));
> +  rectangle r1 = l->item;
> +  while(!nil(l->next)) {
> +    l = l->next;
> +    rectangle r2 = l->item;
> +    if (r2 == rectangle(0,0,0,0)) continue;
> +    r1->x1 = min (r1->x1, r2->x1);
> +    r1->y1 = min (r1->y1, r2->y1);
> +    r1->x2 = max (r1->x2, r2->x2);
> +    r1->y2 = max (r1->y2, r2->y2);
> +  }
> +  return r1;
>  }

No problem, I can do that. Is it really here that we get the stack overflow?
In that case, we might have to check why we do get such long lists of
rectangles.

> ===============================
>
> I changed the semantic a bit to ignore invalid/emtpy rectangles. This makes it
> possible to change the function typesetter_rep::typeset() which is using it
> by getting rid of the call to requires_update() alltogether:
> requires_updates() only removes these invalid rectangles.

I have to check this.

> =================================
> --- /home/weidendo/texmacs/src/src/Typeset/Bridge/typesetter.cpp    2004-07-2
> 5 12:34:06.000000000 +0200
> +++ src/Typeset/Bridge/typesetter.cpp   2004-10-05 10:50:41.768184352 +0200
> @@ -164,7 +164,7 @@ typesetter_rep::typeset (SI& x1b, SI& y1
>    box b= typeset ();
>    // cout <<
> "-------------------------------------------------------------\n";
>    b->position_at (0, 0, change_log);
> -  change_log= requires_update (change_log);
>    rectangle r (0, 0, 0, 0);
>    if (!nil (change_log)) r= least_upper_bound (change_log);
>    x1b= r->x1; y1b= r->y1; x2b= r->x2; y2b= r->y2;
> ===================================
>
> Calls to XCheckMaskEvent in box_rep::redraw() seem to be costly: Obviously,
> the event queue is getting huge on an update (Estimation: 10000 entries on
> average per call. Why?). When commenting out the following line:
>
>  box_rep::redraw (ps_device dev, path p, rectangles& l) {
> -  if ((nr_painted>=10) && dev->check_event (INPUT_EVENT)) return;
> +  //if ((nr_painted>=10) && dev->check_event (INPUT_EVENT)) return;
>
> the CPU load for texmacs in top is cut in half while keeping the system busy
> with scrolling: texmacs time is going down from 12% to 6%, X is taking the
> rest. Of course, this prohibits breaking out of a screen update e.g. on fast
> scrolling. So better use (nr_painted>=100) ?

A better solution might be to use something like

        if ((nr_painted%10 == 9) && dev->check_event (INPUT_EVENT)) return;

I need to check though whether nr_painted cannot be increased during
such interruptions of the painting process though.

Of course, it is quite important to test for keyboard interrupts during
the drawing phase, in order to keep typing reasonably fast, especially
on computer with a slow display or a remote connection.

> Thats for now ;-)
> Any comments?

Thanks a lot for all the work! I will apply your patches soon.

Best wishes, Joris





reply via email to

[Prev in Thread] Current Thread [Next in Thread]