[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Texmacs-dev] Performance questions, proposals, patches
From: |
Josef Weidendorfer |
Subject: |
[Texmacs-dev] Performance questions, proposals, patches |
Date: |
Wed, 13 Oct 2004 23:52:09 +0200 |
User-agent: |
KMail/1.7.1 |
Hi,
I wonder why with GCC >=2.96 on Linux/FreeBSD, the default compilation flags
for texmacs include "-fno-default-inline -fno-inline"?
At least here, if compiling with inlining, the code gains at least 25% speedup
without any negative effects.
In class hashmap_rep, there is the following member:
int max; // mean number of entries per key
IMHO, hash tables should avoid conflicts for fast lookup, and thus, the mean
number of entries per key should always be <1, perhaps 0.9 with a good hash
function. Why is "max" an integer? I would propose for max being a float, and
the default something like 0.8.
The hash function for hyphen_table (Resource/Languages/hyphenate.hpp) seems to
be bad: e.g. when loading the english full user manual, you have 1,2 million
lookups and 4,5 million compares. I suggest here a max value of 0.2.
I found the shrink function taking much time when starting scrolling in a new
document, and looking at it, there is an easy way for optimization. The get_1
(x,y) can be moved out from the inner two loops, and as get_1() either gives
back 0 or 1, you can skip these loops for 0.
============================================================
--- /home/weidendo/texmacs/src/src/Resource/Bitmap_fonts/glyph_shrink.cpp
2004-04-03 13:15:33.000000000 +0200
+++ src/Resource/Bitmap_fonts/glyph_shrink.cpp 2004-10-13 01:07:25.713691104
+0200
@@ -178,17 +178,17 @@ shrink (glyph gl, int xfactor, int yfact
SI off_y = (((Y2-1)*yfactor- dy)*PIXEL - ((ty*PIXEL)>>1))/yfactor;
int i, j, x, y;
- int index, indey;
+ int index, indey, entry;
int ww=(X2-X1)*xfactor, hh=(Y2-Y1)*yfactor;
STACK_NEW_ARRAY (bitmap, int, ww*hh);
for (i=0; i<ww*hh; i++) bitmap[i]=0;
for (y=0, index= ww*frac_y+ frac_x; y<gl->height; y++, index-=ww)
for (x=0; x<gl->width; x++)
- for (j=0, indey=ww*ty; j<=ty; j++, indey-=ww)
- for (i=0; i<=tx; i++) {
- int entry= index+indey+x+i;
- int value= gl->get_1(x,y);
- if (value>bitmap[entry]) bitmap[entry]= value;
+ if (gl->get_1(x,y))
+ for (j=0, indey=ww*ty; j<=ty; j++, indey-=ww) {
+ entry = index+indey+x;
+ for (i=0; i<=tx; i++, entry++)
+ bitmap[entry] = 1;
}
int X, Y, sum, nr= xfactor*yfactor;
=======================================================
Similar, for EPS images, encapsulate_postscript() takes quite some time: only
to get rid of "showpage", it creates a temporary string object for every char
in the EPS. Better put a check for a 's' char before, and append into the
result string in large chunks:
================================================================
--- /home/weidendo/texmacs/src/src/Plugins/Ghostscript/ghostscript.cpp 2
003-10-24 12:43:48.000000000 +0200
+++ src/Plugins/Ghostscript/ghostscript.cpp 2004-10-04 12:19:11.303137496
+0200
@@ -43,11 +43,18 @@ ghostscript_bugged () {
static string
encapsulate_postscript (string s) {
int i, n=N(s);
+ int last_begin = 0;
string r;
- for (i=0; i<n; ) {
- if ((i<(n-8)) && (s(i,i+8)=="showpage")) {i+=8; continue;}
- r << s[i++];
+ for (i=0; i<n; i++) {
+ if ((s[i] != 's') ||
+ (i>(n-8)) ||
+ (s(i,i+8) != "showpage")) continue;
+ // found "showpage" at i
+ if (i>last_begin) r << s(last_begin, i);
+ i += 8;
+ last_begin = i;
}
+ if (n>last_begin) r << s(last_begin, n);
return r;
}
==============================================
A strange thing are segmentation faults with deep function recursions on my
machine: With TeXmacs 1.0.4.2, I get a segfault at the end of loading the
user manual (because of the huge number of rectangles to update on screen?),
in typesetter_rep::requires_updates(), at stack frame around 7900 (using GDB).
This is exactly on a megabyte-boundary in the virtual adress space of the
stack. Perhaps that's a kernel bug (Suse 9.1, 2.6.5), but IMHO at least the
following function is worth converting to an iterative version:
=================================================
--- /home/weidendo/texmacs/src/src/Classes/Atomic/rectangles.cpp 2003-10-2
4 12:43:43.000000000 +0200
+++ src/Classes/Atomic/rectangles.cpp 2004-10-05 11:12:05.207071984 +0200
@@ -199,9 +199,24 @@ simplify (rectangles l) {
rectangle
least_upper_bound (rectangles l) {
if (nil (l)) fatal_error ("no rectangles in list", "least_upper_bound");
- if (atom (l)) return l->item;
- rectangle r1= l->item;
- rectangle r2= least_upper_bound (l->next);
- return rectangle (min (r1->x1, r2->x1), min (r1->y1, r2->y1),
- max (r1->x2, r2->x2), max (r1->y2, r2->y2));
+ rectangle r1 = l->item;
+ while(!nil(l->next)) {
+ l = l->next;
+ rectangle r2 = l->item;
+ if (r2 == rectangle(0,0,0,0)) continue;
+ r1->x1 = min (r1->x1, r2->x1);
+ r1->y1 = min (r1->y1, r2->y1);
+ r1->x2 = max (r1->x2, r2->x2);
+ r1->y2 = max (r1->y2, r2->y2);
+ }
+ return r1;
}
===============================
I changed the semantic a bit to ignore invalid/emtpy rectangles. This makes it
possible to change the function typesetter_rep::typeset() which is using it
by getting rid of the call to requires_update() alltogether:
requires_updates() only removes these invalid rectangles.
=================================
--- /home/weidendo/texmacs/src/src/Typeset/Bridge/typesetter.cpp 2004-07-2
5 12:34:06.000000000 +0200
+++ src/Typeset/Bridge/typesetter.cpp 2004-10-05 10:50:41.768184352 +0200
@@ -164,7 +164,7 @@ typesetter_rep::typeset (SI& x1b, SI& y1
box b= typeset ();
// cout <<
"-------------------------------------------------------------\n";
b->position_at (0, 0, change_log);
- change_log= requires_update (change_log);
rectangle r (0, 0, 0, 0);
if (!nil (change_log)) r= least_upper_bound (change_log);
x1b= r->x1; y1b= r->y1; x2b= r->x2; y2b= r->y2;
===================================
Calls to XCheckMaskEvent in box_rep::redraw() seem to be costly: Obviously,
the event queue is getting huge on an update (Estimation: 10000 entries on
average per call. Why?). When commenting out the following line:
box_rep::redraw (ps_device dev, path p, rectangles& l) {
- if ((nr_painted>=10) && dev->check_event (INPUT_EVENT)) return;
+ //if ((nr_painted>=10) && dev->check_event (INPUT_EVENT)) return;
the CPU load for texmacs in top is cut in half while keeping the system busy
with scrolling: texmacs time is going down from 12% to 6%, X is taking the
rest. Of course, this prohibits breaking out of a screen update e.g. on fast
scrolling. So better use (nr_painted>=100) ?
Thats for now ;-)
Any comments?
Cheers,
Josef
- [Texmacs-dev] Performance questions, proposals, patches,
Josef Weidendorfer <=