glob2-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [glob2-devel] Optimization suggestions (Map::updateLocalGradient())


From: Bradley Arsenault
Subject: Re: [glob2-devel] Optimization suggestions (Map::updateLocalGradient())
Date: Tue, 4 Sep 2007 17:10:30 -0400

On 9/4/07, Erik Søe Sørensen <address@hidden> wrote:
Hello,
I have been doing some profiling on Globulation 2, and found some useful
results.
(I began doing this because the Ubuntu release (0.8.21) had periods of
100% CPU activity, but that seems to have been fixed :-)

Map::updateLocalGradient() seems to be a bottleneck (~40% of the time
spent there).
In this function, there are at least two opportunities for speed
improvement.

For 21% improvement (of the function's run time), get rid of the array
introduced in Map.cpp:3889:
/----
--- Map.cpp.org  2007-09-04 05:20:41.000000000 +0200
+++ Map.cpp.erk  2007-09-04 05:23:52.000000000 +0200
@@ -3886,22 +3886,17 @@
                  else
                    xr=x+1;

-                Uint8 side[8];
-                side[0]=gradient[wyu+xl];
-                side[1]=gradient[wyu+x ];
-                side[2]=gradient[wyu+xr];
-
-                side[3]=gradient[wy +xr];
-
-                side[4]=gradient[wyd+xr];
-                side[5]=gradient[wyd+x ];
-                side[6]=gradient[wyd+xl];
-
-                side[7]=gradient[wy +xl];
-
-                for (int i=0; i<8; i++)
-                  if (side[i]>max)
-                    max=side[i];
+#define UPDATE_MAX(xy) {Uint8 tmp = gradient[(xy)]; if (tmp>max) max=tmp;}
+                UPDATE_MAX(wyu+xl);
+                UPDATE_MAX(wyu+x );
+                UPDATE_MAX(wyu+xr);
+                UPDATE_MAX(wy +xr);
+                UPDATE_MAX(wyd+xr);
+                UPDATE_MAX(wyd+x );
+                UPDATE_MAX(wyd+xl);
+                UPDATE_MAX(wy +xl);
+#undef UPDATE_MAX
+
                  assert(max);
                  if (max==1)
                    gradient[wy+x]=1;
\----

For further ~9% improvement, avoid copying a Case structure in
Map.cpp :3369, still in updateLocalGradient():
/----
--- Map.cpp.org 2007-09-04 05:20:41.000000000 +0200
+++ Map.cpp.erk  2007-09-04 05:28:23.000000000 +0200
@@ -3366,7 +3366,7 @@
          for (int xl=0; xl<32; xl++)
          {
            int xg=(xl+posX-15)&wMask;
-           Case c=cases[wyg+xg];
+           Case& c=cases[wyg+xg];
            int addrl=wyl+xl;
            if (gradient[addrl]!=255)
            {
\----
(This happens at several other locations, by the way... the second-worst
place in Map.cpp appears to be in Map::updateRessourcesGradient(), and
it'might also be worth the attention :-)

And another thing, now we're at reading code:
A couple of places in Map::updateLocalGradient(), there is code like this:
         if (xxi<0)
            xxi=0;
        else if (xxi>31)
            xxi=31;
       if (yyi<0)
            yyi=0;
        else if (yyi>31)
            xxi=31; // <-- Shouldn't this be yyi??
That last line does look like a typo (or more likely a copy'n'paste error?).

Regards,
Erik Søe Sørensen

PS: Great looking game, by the way! :-)


[Profiling done on a ThinkPad, CPU: Intel(R) Pentium(R) M processor
1300MHz stepping 05]
[Today's lesson: Don't begin profiling on a non-optimized build.]


Thats very impressive, I myself have done extensive optimization, but I didn't tackle the main gradient algorithms, mostly I did my own ones that are used in the Echo AI system. I couldn't make my algorithm any faster, so instead I distributed the algorithm over multiple frames, which makes echo, while less responsive, more smooth.

Anyhow, coincidentally, today I was actually working out how to optimize gradient algorithms in a more computer-science type way. We can do one or the other of the following, but both of them require keeping track of changes to the map.

1) We initialize the gradient for every turn, when most of this information is repeated. We could save a small amount of time by instead keeping an ongoing initial value, updating it instead when the values themselves are updated. Since there are two different initialization routines (for different kinds of paths), we need to keep two of these. When doing the gradient itself, instead of going through the entire map filtering changing etc.. We can just do straight copy of the values, which should be faster than accessing the whole map.

2) Alternatively, we can do a continuous update system, rather than a complete refresh. Basically, we keep track of changes on the map, where a change is from a path obstacle to a non obstacle, or vice versa.

a) for each obstacle->non obstacle, at that position, take the value of the smallest neighbor, then propagate that value like the normal algorithm does, except this algorithm only propagates into cells that will change with this prorogation. This means that, most of the time, the algorithm will quit quickly.

b) for each non obstacle->obstacle, mark each neighbor that has a higher value than the cell had before it changed as 0. Propagate into these cells, doing the same thing, propagating into each neighbor that has a higher value, and leaving behind the new values of 0. This is basically marking the region that is affected by the new obstacle. Each neighbor that had an equal or lower value was marked as a source and added to a list (or set, so duplicates don't occur). After this portion of the algorithm exits, the next portion fills in the damaged portion of the map by propagating from the sources like a normal gradient algorithm would.

These two updates don't have to be done in parallel, one can do a, then b or vice versa it won't affect the result. The danger here is that the same value may be touched many times, but in reality, the glob2 world doesn't change much between updates, so may improve performance.

--
Extra cheese comes at a cost. Bradley Arsenault.
reply via email to

[Prev in Thread] Current Thread [Next in Thread]