[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[glob2-devel] More Map code improvements
From: |
Erik Søe Sørensen |
Subject: |
[glob2-devel] More Map code improvements |
Date: |
Wed, 05 Sep 2007 12:32:53 +0200 |
User-agent: |
Mozilla/5.0 (X11; U; Linux i686; da; rv:1.7.13) Gecko/20060717 Debian/1.7.12-1.1ubuntu2 StumbleUpon/1.9995 |
Erik Søe Sørensen skrev:
Joe Wells skrev:
But first I'll have to send you the latest results - I got an
additional 34% speed increase in that function today (and simplified
code in several places)...
- I'll be back, presumably tomorrow.
Which is now...
Martin Voelkle wrote:
> Thank you, I have commited these fixes to the master branch.
Hm, I don't see them. But of course I don't know Mercurial so well (did
try to update, though).
This means that the attached patch contains the changes I've mentioned
earlier, as well as some new stuff.
Martin Voelkle wrote:
[snip - about possible typo]
I changed it anyway.
Better yet, why not just get rid of it? :-)
Less code means less typo-prone code... (Especially when so many bits
are repeated.)
I've shortened the code several places - updateLocalGradient(), for
instance, was 300 lines - a bit on the long side. Better now, I hope (222).
The changes are:
- Copying a Case struct has been avoided (6 places).
- Building an array, then looping over it just to find the max, has been
avoided (9 places, including one hotspot). I added a simple macro
(UPDATE_MAX) to facilitate this.
- Introduced clip_0_31() for clipping a value to the range [0;31]. With
this, we avoid several near-idential 4-line if statements (12 places).
- Refactored updateLocalGradient() a bit by extracting functions
fillGradientCircle() and fillGradientRectangle().
- Reworked inner³-loop hotspot. Besides avoiding making an array,
there's less testing involved - and the resulting code is shorter (in
source as well as in code size, I think).
That's all, I think.
Total estimated time reduction for ULG: 37% :-)
I might look into algorithmic improvements next; I just hope that I've
understood the purpose correctly. Oh, lookit that, there's some
documentation on gradients - I'd better go read that then.
[Profiling conditions:
Intel(R) Pentium(R) M processor 1300MHz stepping 5;
gcc (GCC) 4.1.2 (Ubuntu 4.1.2-0ubuntu4);
Linux #2 Thu Jun 7 20:16:13 UTC 2007 i686 GNU/Linux]
/Erik
--- src/Map.cpp.org 2007-09-05 10:18:25.000000000 +0200
+++ src/Map.cpp 2007-09-05 10:28:06.000000000 +0200
@@ -34,6 +34,7 @@
#include <map>
#endif
+#define UPDATE_MAX(max,value) {Uint8 tmp = value; if (value>(max))
(max)=value;}
// use deltaOne for first perpendicular direction
static const int deltaOne[8][2]={
@@ -2871,7 +2872,7 @@
assert(globalContainer);
for (size_t i=0; i<size; i++)
{
- Case c=cases[i];
+ Case& c=cases[i];
if ((c.forbidden|c.hiddenForbidden)&teamMask)
gradient[i]=0;
else if (c.ressource.type==NO_RES_TYPE)
@@ -2911,60 +2912,44 @@
if (max && max!=255)
{
max=1;
- Uint8 side[5];
- side[0]=miniGrad[0+2*5];
- side[1]=miniGrad[0+1*5];
- side[2]=miniGrad[0+0*5];
- side[3]=miniGrad[1+0*5];
- side[4]=miniGrad[2+0*5];
- for (int i=0; i<5; i++)
- if (side[i]>max)
- max=side[i];
+ UPDATE_MAX(max,miniGrad[0+2*5]);
+ UPDATE_MAX(max,miniGrad[0+1*5]);
+ UPDATE_MAX(max,miniGrad[0+0*5]);
+ UPDATE_MAX(max,miniGrad[1+0*5]);
+ UPDATE_MAX(max,miniGrad[2+0*5]);
}
maxs[0]=(max<<8)|mxd;
max=mxd=miniGrad[3+1*5];
if (max && max!=255)
{
max=1;
- Uint8 side[5];
- side[0]=miniGrad[2+0*5];
- side[1]=miniGrad[3+0*5];
- side[2]=miniGrad[4+0*5];
- side[3]=miniGrad[4+1*5];
- side[4]=miniGrad[4+2*5];
- for (int i=0; i<5; i++)
- if (side[i]>max)
- max=side[i];
+ UPDATE_MAX(max,miniGrad[2+0*5]);
+ UPDATE_MAX(max,miniGrad[3+0*5]);
+ UPDATE_MAX(max,miniGrad[4+0*5]);
+ UPDATE_MAX(max,miniGrad[4+1*5]);
+ UPDATE_MAX(max,miniGrad[4+2*5]);
}
maxs[1]=(max<<8)|mxd;
max=mxd=miniGrad[3+3*5];
if (max && max!=255)
{
max=1;
- Uint8 side[5];
- side[0]=miniGrad[4+2*5];
- side[1]=miniGrad[4+3*5];
- side[2]=miniGrad[4+4*5];
- side[3]=miniGrad[3+4*5];
- side[4]=miniGrad[2+4*5];
- for (int i=0; i<5; i++)
- if (side[i]>max)
- max=side[i];
+ UPDATE_MAX(max,miniGrad[4+2*5]);
+ UPDATE_MAX(max,miniGrad[4+3*5]);
+ UPDATE_MAX(max,miniGrad[4+4*5]);
+ UPDATE_MAX(max,miniGrad[3+4*5]);
+ UPDATE_MAX(max,miniGrad[2+4*5]);
}
maxs[2]=(max<<8)|mxd;
max=mxd=miniGrad[1+3*5];
if (max && max!=255)
{
max=1;
- Uint8 side[5];
- side[0]=miniGrad[2+4*5];
- side[1]=miniGrad[1+4*5];
- side[2]=miniGrad[0+4*5];
- side[3]=miniGrad[0+3*5];
- side[4]=miniGrad[0+2*5];
- for (int i=0; i<5; i++)
- if (side[i]>max)
- max=side[i];
+ UPDATE_MAX(max,miniGrad[2+4*5]);
+ UPDATE_MAX(max,miniGrad[1+4*5]);
+ UPDATE_MAX(max,miniGrad[0+4*5]);
+ UPDATE_MAX(max,miniGrad[0+3*5]);
+ UPDATE_MAX(max,miniGrad[0+2*5]);
}
maxs[3]=(max<<8)|mxd;
@@ -2973,52 +2958,36 @@
if (max && max!=255)
{
max=1;
- Uint8 side[3];
- side[0]=miniGrad[1+0*5];
- side[1]=miniGrad[2+0*5];
- side[2]=miniGrad[3+0*5];
- for (int i=0; i<3; i++)
- if (side[i]>max)
- max=side[i];
+ UPDATE_MAX(max,miniGrad[1+0*5]);
+ UPDATE_MAX(max,miniGrad[2+0*5]);
+ UPDATE_MAX(max,miniGrad[3+0*5]);
}
maxs[4]=(max<<8)|mxd;
max=mxd=miniGrad[3+2*5];
if (max && max!=255)
{
max=1;
- Uint8 side[3];
- side[0]=miniGrad[4+1*5];
- side[1]=miniGrad[4+2*5];
- side[2]=miniGrad[4+3*5];
- for (int i=0; i<3; i++)
- if (side[i]>max)
- max=side[i];
+ UPDATE_MAX(max,miniGrad[4+1*5]);
+ UPDATE_MAX(max,miniGrad[4+2*5]);
+ UPDATE_MAX(max,miniGrad[4+3*5]);
}
maxs[5]=(max<<8)|mxd;
max=mxd=miniGrad[2+3*5];
if (max && max!=255)
{
max=1;
- Uint8 side[3];
- side[0]=miniGrad[1+4*5];
- side[1]=miniGrad[2+4*5];
- side[2]=miniGrad[3+4*5];
- for (int i=0; i<3; i++)
- if (side[i]>max)
- max=side[i];
+ UPDATE_MAX(max,miniGrad[1+4*5]);
+ UPDATE_MAX(max,miniGrad[2+4*5]);
+ UPDATE_MAX(max,miniGrad[3+4*5]);
}
maxs[6]=(max<<8)|mxd;
max=mxd=miniGrad[1+2*5];
if (max && max!=255)
{
max=1;
- Uint8 side[3];
- side[0]=miniGrad[0+1*5];
- side[1]=miniGrad[0+2*5];
- side[2]=miniGrad[0+3*5];
- for (int i=0; i<3; i++)
- if (side[i]>max)
- max=side[i];
+ UPDATE_MAX(max,miniGrad[0+1*5]);
+ UPDATE_MAX(max,miniGrad[0+2*5]);
+ UPDATE_MAX(max,miniGrad[0+3*5]);
}
maxs[7]=(max<<8)|mxd;
@@ -3296,6 +3265,40 @@
}
}
+
+
+/** Helper for updateLocalGradient, and others */
+int clip_0_31(int x) {return (x<0)? 0 : (x>31)? 31 : x;}
+
+/** Helper for updateLocalGradient */
+void fillGradientCircle(Uint8* gradient, int r) {
+ int r2=r*r;
+ for (int yi=-r; yi<=r; yi++)
+ {
+ int yi2=yi*yi;
+ int yyi=clip_0_31(15+yi);
+ for (int xi=-r; xi<=r; xi++)
+ if (yi2+(xi*xi)<r2)
+ {
+ int xxi=clip_0_31(15+xi);
+ gradient[xxi+(yyi<<5)]=255;
+ }
+ }
+}
+
+/** Helper for updateLocalGradient */
+void fillGradientRectangle(Uint8* gradient, int posW, int posH) {
+ for (int dy=0; dy<posH; dy++) {
+ int yyi=clip_0_31(15+dy);
+ for (int dx=0; dx<posW; dx++)
+ {
+ int xxi=clip_0_31(15+dx);
+ gradient[xxi+(yyi<<5)]=255;
+ }
+ }
+}
+
+
void Map::updateLocalGradient(Building *building, bool canSwim)
{
localBuildingGradientUpdate++;
@@ -3318,43 +3321,10 @@
{
assert(!building->type->zonableForbidden);
int r=building->unitStayRange;
- int r2=r*r;
- for (int yi=-r; yi<=r; yi++)
- {
- int yi2=yi*yi;
- for (int xi=-r; xi<=r; xi++)
- if (yi2+(xi*xi)<r2)
- {
- int xxi=15+xi;
- int yyi=15+yi;
- if (xxi<0)
- xxi=0;
- else if (xxi>31)
- xxi=31;
- if (yyi<0)
- yyi=0;
- else if (yyi>31)
- yyi=31;
- gradient[xxi+(yyi<<5)]=255;
- }
- }
+ fillGradientCircle(gradient, r);
}
else
- for (int dy=0; dy<posH; dy++)
- for (int dx=0; dx<posW; dx++)
- {
- int xxi=15+dx;
- int yyi=15+dy;
- if (xxi<0)
- xxi=0;
- else if (xxi>31)
- xxi=31;
- if (yyi<0)
- yyi=0;
- else if (yyi>31)
- yyi=31;
- gradient[xxi+(yyi<<5)]=255;
- }
+ fillGradientRectangle(gradient, posW, posH);
// Here g=Global(map axis), l=Local(map axis)
@@ -3385,28 +3355,7 @@
if (building->type->zonable[WORKER])
{
int r=building->unitStayRange;
- int r2=r*r;
- for (int yi=-r; yi<=r; yi++)
- {
- int yi2=yi*yi;
- for (int xi=-r; xi<=r; xi++)
- {
- if (yi2+(xi*xi)<=r2)
- {
- int xxi=15+xi;
- int yyi=15+yi;
- if (xxi<0)
- xxi=0;
- else if (xxi>31)
- xxi=31;
- if (yyi<0)
- yyi=0;
- else if (yyi>31)
- yyi=31;
- gradient[xxi+(yyi<<5)]=255;
- }
- }
- }
+ fillGradientCircle(gradient, r);
}
if (!building->type->isVirtual)
@@ -3430,6 +3379,7 @@
building->locked[canSwim]=false;
goto doubleBreak;
}
+
switch (ai)
{
case 0:
@@ -3497,51 +3447,26 @@
assert(y<32);
int wy=(y<<5);
- int wyu, wyd;
- if (y==0)
- wyu=0;
- else
- wyu=((y-1)<<5);
- if (y==31)
- wyd=32*31;
- else
- wyd=((y+1)<<5);
Uint8
max=gradient[wy+x];
if (max && max!=255)
{
- int xl, xr;
- if (x==0)
- xl=0;
- else
- xl=x-1;
- if (x==31)
- xr=31;
- else
- xr=x+1;
-
- Uint8 side[8];
-
side[0]=gradient[wyu+xl];
-
side[1]=gradient[wyu+x ];
-
side[2]=gradient[wyu+xr];
-
-
side[3]=gradient[wy +xr];
-
-
side[4]=gradient[wyd+xr];
-
side[5]=gradient[wyd+x ];
-
side[6]=gradient[wyd+xl];
-
-
side[7]=gradient[wy +xl];
-
- for (int i=0;
i<8; i++)
- if
(side[i]>max)
-
max=side[i];
+ for (int
dy=-32; dy<=32; dy+=32) {
+ int
ypart = wy+dy;
+ if
(ypart & (32*32)) continue; // Over- or underflow
+ for
(int dx=-1; dx<=1; dx++) {
+
int xpart = x+dx;
+
if (xpart & 32) continue; // Over- or underflow
+
UPDATE_MAX(max,gradient[ypart+xpart]);
+ }
+ }
+
assert(max);
if (max==1)
gradient[wy+x]=1;
else
gradient[wy+x]=max-1;
}
-
+
if (bi==0)
{
switch (ai)
@@ -3652,7 +3577,7 @@
for (int x=0; x<w; x++)
{
int wyx=wy+x;
- Case c=cases[wyx];
+ Case& c=cases[wyx];
if (c.building==NOGBID)
{
if (c.ressource.type!=NO_RES_TYPE)
@@ -4001,16 +3926,8 @@
{
int ddx, ddy;
Unit::dxdyfromDirection(d, &ddx, &ddy);
- int lxddx=lx+ddx;
- if (lxddx<0)
- lxddx=0;
- else if(lxddx>31)
- lxddx=31;
- int lyddy=ly+ddy;
- if (lyddy<0)
- lyddy=0;
- else if(lyddy>31)
- lyddy=31;
+ int lxddx=clip_0_31(lx+ddx);
+ int lyddy=clip_0_31(ly+ddy);
Uint8 g=gradient[lxddx+(lyddy<<5)];
if (g>1)
{
@@ -4044,16 +3961,8 @@
{
int ddx, ddy;
Unit::dxdyfromDirection(d, &ddx, &ddy);
- int lxddx=lx+ddx;
- if (lxddx<0)
- lxddx=0;
- else if(lxddx>31)
- lxddx=31;
- int lyddy=ly+ddy;
- if (lyddy<0)
- lyddy=0;
- else if(lyddy>31)
- lyddy=31;
+ int lxddx=clip_0_31(lx+ddx);
+ int lyddy=clip_0_31(ly+ddy);
Uint8 g=gradient[lxddx+(lyddy<<5)];
if (g>1)
{
@@ -4388,16 +4297,8 @@
{
int ddx, ddy;
Unit::dxdyfromDirection(d, &ddx, &ddy);
- int lxddx=lx+ddx;
- if (lxddx<0)
- lxddx=0;
- else if(lxddx>31)
- lxddx=31;
- int lyddy=ly+ddy;
- if (lyddy<0)
- lyddy=0;
- else if(lyddy>31)
- lyddy=31;
+ int lxddx=clip_0_31(lx+ddx);
+ int lyddy=clip_0_31(ly+ddy);
Uint8 g=gradient[lxddx+(lyddy<<5)];
if (!gradientUsable && g>currentg &&
isHardSpaceForGroundUnit(x+ddx, y+ddy, canSwim, teamMask))
gradientUsable=true;
@@ -4450,16 +4351,8 @@
{
int ddx, ddy;
Unit::dxdyfromDirection(d, &ddx, &ddy);
- int lxddx=lx+ddx;
- if (lxddx<0)
- lxddx=0;
- else if(lxddx>31)
- lxddx=31;
- int lyddy=ly+ddy;
- if (lyddy<0)
- lyddy=0;
- else if(lyddy>31)
- lyddy=31;
+ int lxddx=clip_0_31(lx+ddx);
+ int lyddy=clip_0_31(ly+ddy);
Uint8 g=gradient[lxddx+(lyddy<<5)];
if (!gradientUsable && g>currentg &&
isHardSpaceForGroundUnit(x+ddx, y+ddy, canSwim, teamMask))
gradientUsable=true;
@@ -4663,7 +4556,7 @@
assert(gradient);
for (size_t i = 0; i < size; i++)
{
- Case c = cases[i];
+ const Case& c = cases[i];
if ((c.ressource.type != NO_RES_TYPE) || (c.building!=NOGBID)
|| (!canSwim && isWater(i)))
{
gradient[i] = 0;
@@ -4723,7 +4616,7 @@
// We set the obstacle and free places
for (size_t i=0; i<size; i++)
{
- Case c=cases[i];
+ const Case& c=cases[i];
if (c.ressource.type!=NO_RES_TYPE)
testgradient[i] = 0;
else if (c.building!=NOGBID)
@@ -4788,7 +4681,7 @@
for (size_t i=0; i<size; i++)
{
- Case c=cases[i];
+ const Case& c=cases[i];
if (c.ressource.type!=NO_RES_TYPE)
gradient[i] = 0;
else if (c.building!=NOGBID)
@@ -4843,7 +4736,7 @@
Uint32 teamMask = Team::teamNumberToMask(teamNumber);
for (size_t i=0; i<size; i++)
{
- Case c=cases[i];
+ const Case& c=cases[i];
if (c.forbidden & teamMask || c.hiddenForbidden & teamMask)
gradient[i] = 0;
else if (c.ressource.type != NO_RES_TYPE)
- [glob2-devel] Optimization suggestions (Map::updateLocalGradient()), Erik Søe Sørensen, 2007/09/04
- Re: [glob2-devel] Optimization suggestions (Map::updateLocalGradient()), Kai Antweiler, 2007/09/05
- Re: [glob2-devel] Optimization suggestions, Erik Søe Sørensen, 2007/09/05
- Re: [glob2-devel] Optimization suggestions, Erik Søe Sørensen, 2007/09/05
- Re: [glob2-devel] Optimization suggestions, Kai Antweiler, 2007/09/05
- Re: [glob2-devel] Optimization suggestions, Bradley Arsenault, 2007/09/05