[Texmacs-dev] Performance questions, proposals, patches


From: Josef Weidendorfer
Subject: [Texmacs-dev] Performance questions, proposals, patches
Date: Wed, 13 Oct 2004 23:52:09 +0200
User-agent: KMail/1.7.1

Hi,

Why do the default compilation flags for TeXmacs include
"-fno-default-inline -fno-inline" with GCC >=2.96 on Linux/FreeBSD?
At least here, compiling with inlining enabled makes the code at least 25%
faster, without any negative effects that I could see.

In class hashmap_rep, there is the following member:
  int max;                   // mean number of entries per key
IMHO, hash tables should avoid collisions for fast lookup, so the mean 
number of entries per key should always be <1, perhaps 0.9 with a good hash 
function. Why is "max" an integer? I propose making max a float, with a 
default of something like 0.8.
The hash function for hyphen_table (Resource/Languages/hyphenate.hpp) seems to 
be bad: e.g. when loading the full English user manual, there are 1.2 million 
lookups and 4.5 million compares. I suggest a max value of 0.2 here.
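
To illustrate what a fractional threshold would mean, here is a minimal
standalone sketch. It is not the hashmap_rep code: the names buckets_for and
needs_resize are hypothetical, and I assume the table grows whenever the mean
number of entries per bucket exceeds the threshold.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical helper, not part of TeXmacs: number of buckets needed so that
// the mean number of entries per bucket stays at or below max_load.
std::size_t buckets_for(std::size_t entries, double max_load) {
  return static_cast<std::size_t>(std::ceil(entries / max_load));
}

// Hypothetical resize test with a fractional threshold: true when the mean
// number of entries per bucket has grown past max_load.
bool needs_resize(std::size_t entries, std::size_t buckets, double max_load) {
  return static_cast<double>(entries) > max_load * static_cast<double>(buckets);
}
```

With an integer threshold, the smallest possible setting is one entry per
bucket on average; a float allows sparser tables, trading memory for fewer
compares. For comparison, std::unordered_map in standard C++ exposes its
threshold as a float through max_load_factor().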

I found that the shrink function takes a lot of time when starting to scroll 
in a new document, and looking at it, there is an easy optimization. The call 
to get_1(x,y) can be moved out of the two inner loops, and since get_1() 
returns either 0 or 1, these loops can be skipped entirely when it returns 0.
============================================================
--- /home/weidendo/texmacs/src/src/Resource/Bitmap_fonts/glyph_shrink.cpp   2004-04-03 13:15:33.000000000 +0200
+++ src/Resource/Bitmap_fonts/glyph_shrink.cpp  2004-10-13 01:07:25.713691104 +0200
@@ -178,17 +178,17 @@ shrink (glyph gl, int xfactor, int yfact
   SI  off_y = (((Y2-1)*yfactor- dy)*PIXEL - ((ty*PIXEL)>>1))/yfactor;

   int i, j, x, y;
-  int index, indey;
+  int index, indey, entry;
   int ww=(X2-X1)*xfactor, hh=(Y2-Y1)*yfactor;
   STACK_NEW_ARRAY (bitmap, int, ww*hh);
   for (i=0; i<ww*hh; i++) bitmap[i]=0;
   for (y=0, index= ww*frac_y+ frac_x; y<gl->height; y++, index-=ww)
     for (x=0; x<gl->width; x++)
-      for (j=0, indey=ww*ty; j<=ty; j++, indey-=ww)
-       for (i=0; i<=tx; i++) {
-         int entry= index+indey+x+i;
-         int value= gl->get_1(x,y);
-         if (value>bitmap[entry]) bitmap[entry]= value;
+      if (gl->get_1(x,y))
+       for (j=0, indey=ww*ty; j<=ty; j++, indey-=ww) {
+         entry = index+indey+x;
+         for (i=0; i<=tx; i++, entry++)
+           bitmap[entry] = 1;
        }

   int X, Y, sum, nr= xfactor*yfactor;
=======================================================
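
The idea behind this patch can be checked in isolation with a toy version.
This is only a sketch under simplifying assumptions: ToyGlyph, spread_naive
and spread_hoisted are hypothetical names, and the index arithmetic is much
simpler than in the real shrink(). It only shows that hoisting the pixel test
gives the same bitmap when pixels are 0 or 1.

```cpp
#include <vector>

// Toy stand-in for a glyph: a width x height grid of 0/1 pixels.
struct ToyGlyph {
  int width, height;
  std::vector<int> bits;
  int get_1(int x, int y) const { return bits[y * width + x]; }
};

// Original loop shape: get_1 is evaluated inside the two inner loops.
std::vector<int> spread_naive(const ToyGlyph& gl, int tx, int ty) {
  int ww = gl.width + tx, hh = gl.height + ty;
  std::vector<int> bitmap(ww * hh, 0);
  for (int y = 0; y < gl.height; y++)
    for (int x = 0; x < gl.width; x++)
      for (int j = 0; j <= ty; j++)
        for (int i = 0; i <= tx; i++) {
          int entry = (y + j) * ww + x + i;
          int value = gl.get_1(x, y);
          if (value > bitmap[entry]) bitmap[entry] = value;
        }
  return bitmap;
}

// Patched loop shape: test the pixel once, skip the inner loops when it is 0.
std::vector<int> spread_hoisted(const ToyGlyph& gl, int tx, int ty) {
  int ww = gl.width + tx, hh = gl.height + ty;
  std::vector<int> bitmap(ww * hh, 0);
  for (int y = 0; y < gl.height; y++)
    for (int x = 0; x < gl.width; x++)
      if (gl.get_1(x, y))
        for (int j = 0; j <= ty; j++) {
          int entry = (y + j) * ww + x;
          for (int i = 0; i <= tx; i++, entry++)
            bitmap[entry] = 1;
        }
  return bitmap;
}
```

Both functions return the same vector for any 0/1 glyph; the second one does
no work at all for unset pixels, which are the majority in typical glyphs.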

Similarly, for EPS images, encapsulate_postscript() takes quite some time: 
just to get rid of "showpage", it creates a temporary string object for every 
char in the EPS. It is better to check for an 's' char first, and to append to 
the result string in large chunks:

================================================================
--- /home/weidendo/texmacs/src/src/Plugins/Ghostscript/ghostscript.cpp      2003-10-24 12:43:48.000000000 +0200
+++ src/Plugins/Ghostscript/ghostscript.cpp     2004-10-04 12:19:11.303137496 +0200
@@ -43,11 +43,18 @@ ghostscript_bugged () {
 static string
 encapsulate_postscript (string s) {
   int i, n=N(s);
+  int last_begin = 0;
   string r;
-  for (i=0; i<n; ) {
-    if ((i<(n-8)) && (s(i,i+8)=="showpage")) {i+=8; continue;}
-    r << s[i++];
+  for (i=0; i<n; i++) {
+    if ((s[i] != 's') ||
+       (i>(n-8)) ||
+       (s(i,i+8) != "showpage")) continue;
+    // found "showpage" at i
+    if (i>last_begin) r << s(last_begin, i);
+    last_begin = i + 8;
+    i += 7;  // the loop's i++ then resumes right after "showpage"
   }
+  if (n>last_begin) r << s(last_begin, n);
   return r;
 }
==============================================
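
The same chunked approach can be sketched with std::string instead of the
TeXmacs string class. This is an illustration only: strip_showpage is a
hypothetical name, and std::string::find does the search that the patch does
by hand with the 's' check.

```cpp
#include <cstddef>
#include <string>

// Remove every occurrence of "showpage", copying the text between
// occurrences in whole chunks rather than one character at a time.
std::string strip_showpage(const std::string& s) {
  static const std::string token = "showpage";
  std::string r;
  r.reserve(s.size());
  std::size_t last_begin = 0;
  std::size_t pos;
  while ((pos = s.find(token, last_begin)) != std::string::npos) {
    r.append(s, last_begin, pos - last_begin);  // chunk before the token
    last_begin = pos + token.size();            // resume right after it
  }
  r.append(s, last_begin, std::string::npos);   // trailing chunk
  return r;
}
```

Because the search resumes exactly at last_begin, adjacent occurrences such
as "showpageshowpage" are both removed, and an occurrence at the very end of
the input is handled like any other.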

Another strange thing is segmentation faults with deep function recursion on 
my machine: with TeXmacs 1.0.4.2, I get a segfault at the end of loading the 
user manual (because of the huge number of rectangles to update on screen?),
in typesetter_rep::requires_updates(), at around stack frame 7900 (using GDB).
This is exactly on a megabyte boundary in the virtual address space of the 
stack. Perhaps that's a kernel bug (SuSE 9.1, 2.6.5), but IMHO at least the 
following function is worth converting to an iterative version:

=================================================
--- /home/weidendo/texmacs/src/src/Classes/Atomic/rectangles.cpp    2003-10-24 12:43:43.000000000 +0200
+++ src/Classes/Atomic/rectangles.cpp   2004-10-05 11:12:05.207071984 +0200
@@ -199,9 +199,24 @@ simplify (rectangles l) {
 rectangle
 least_upper_bound (rectangles l) {
   if (nil (l)) fatal_error ("no rectangles in list", "least_upper_bound");
-   if (atom (l)) return l->item;
-   rectangle r1= l->item;
-   rectangle r2= least_upper_bound (l->next);
-   return rectangle (min (r1->x1, r2->x1), min (r1->y1, r2->y1),
-                    max (r1->x2, r2->x2), max (r1->y2, r2->y2));
+  rectangle r1 = l->item;
+  while(!nil(l->next)) {
+    l = l->next;
+    rectangle r2 = l->item;
+    if (r2 == rectangle(0,0,0,0)) continue;
+    r1->x1 = min (r1->x1, r2->x1);
+    r1->y1 = min (r1->y1, r2->y1);
+    r1->x2 = max (r1->x2, r2->x2);
+    r1->y2 = max (r1->y2, r2->y2);
+  }
+  return r1;
 }
===============================

I changed the semantics a bit to ignore invalid/empty rectangles. This makes 
it possible to simplify typesetter_rep::typeset(), which uses this function, 
by getting rid of the call to requires_update() altogether: 
requires_update() only removes these invalid rectangles.

=================================
--- /home/weidendo/texmacs/src/src/Typeset/Bridge/typesetter.cpp    2004-07-25 12:34:06.000000000 +0200
+++ src/Typeset/Bridge/typesetter.cpp   2004-10-05 10:50:41.768184352 +0200
@@ -164,7 +164,7 @@ typesetter_rep::typeset (SI& x1b, SI& y1
   box b= typeset ();
   // cout << "-------------------------------------------------------------\n";
   b->position_at (0, 0, change_log);
-  change_log= requires_update (change_log);
   rectangle r (0, 0, 0, 0);
   if (!nil (change_log)) r= least_upper_bound (change_log);
   x1b= r->x1; y1b= r->y1; x2b= r->x2; y2b= r->y2;
===================================
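
For reference, the iterative least upper bound with empty rectangles skipped
can be sketched in standalone C++. This is not the TeXmacs code: Rect,
is_empty and the use of std::forward_list are assumptions made for the
example, and unlike the patch it returns (0,0,0,0) instead of raising an
error when the list contains no valid rectangle.

```cpp
#include <algorithm>
#include <forward_list>

// Hypothetical plain-value rectangle, standing in for the TeXmacs type.
struct Rect { int x1, y1, x2, y2; };

// A rectangle of all zeros is treated as "invalid/empty" and ignored.
inline bool is_empty(const Rect& r) {
  return r.x1 == 0 && r.y1 == 0 && r.x2 == 0 && r.y2 == 0;
}

// Single pass over the list, constant stack depth regardless of list length.
Rect least_upper_bound(const std::forward_list<Rect>& l) {
  bool found = false;
  Rect acc{0, 0, 0, 0};
  for (const Rect& r : l) {
    if (is_empty(r)) continue;
    if (!found) { acc = r; found = true; continue; }
    acc.x1 = std::min(acc.x1, r.x1);
    acc.y1 = std::min(acc.y1, r.y1);
    acc.x2 = std::max(acc.x2, r.x2);
    acc.y2 = std::max(acc.y2, r.y2);
  }
  return acc;
}
```

The accumulator is a copy, so the list elements are never modified, and since
empty rectangles are skipped inside the loop no separate filtering pass is
needed before calling it.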

Calls to XCheckMaskEvent in box_rep::redraw() seem to be costly: apparently 
the event queue gets huge on an update (my estimate: 10000 entries on 
average per call. Why?). When commenting out the following line:

 box_rep::redraw (ps_device dev, path p, rectangles& l) {
-  if ((nr_painted>=10) && dev->check_event (INPUT_EVENT)) return;
+  //if ((nr_painted>=10) && dev->check_event (INPUT_EVENT)) return;

the CPU load for texmacs in top is cut in half while keeping the system busy 
with scrolling: texmacs time goes down from 12% to 6%, and X takes the rest. 
Of course, this makes it impossible to break out of a screen update, e.g. on 
fast scrolling. So would it be better to use (nr_painted>=100)?
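
A related variant, which is not what the line above does, would be to poll
only on every Nth call instead of on every call past a threshold. A minimal
sketch, assuming a hypothetical ThrottledPoll wrapper around whatever
expensive check is used:

```cpp
#include <functional>
#include <utility>

// Hypothetical helper: runs the expensive poll only on every `interval`-th
// invocation and reports "no event" on all the others.
class ThrottledPoll {
public:
  ThrottledPoll(int interval, std::function<bool()> poll)
    : interval_(interval), count_(0), poll_(std::move(poll)) {}

  bool operator()() {
    if (++count_ < interval_) return false;  // skip the expensive check
    count_ = 0;
    return poll_();
  }

private:
  int interval_;
  int count_;
  std::function<bool()> poll_;
};
```

With an interval of 100, 250 calls trigger the underlying check only twice, so
the cost of scanning the event queue is paid far less often while a redraw can
still be interrupted.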

That's all for now ;-)
Any comments?

Cheers,
Josef



