swarm-support
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Optimization: why don't I get no speedup?


From: Paul E. Johnson
Subject: Optimization: why don't I get no speedup?
Date: Mon, 07 Jan 2002 12:58:49 -0600
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.6) Gecko/20011120

While profiling an application written in Obj-C, I trace a bottleneck back to one specific method. It is a method that updates a lattice, and it works by beginning at a point, and then taking cells within a "radius" and updating them.

If the model does not "wrap around" the edges of the grid, there is some wasted effort, because the for loop checks each time to see if a cell is "out of bounds" and if it is out, then it "continues" and goes on to the next cell. I guessed that this was wasting time, because it uses a lot of steps to do nothing when the radius is very large, and tried to redesign the algorithm so that the boundaries of the radius are calculated first and then the loop iterates from edge to edge of the legitimate radius. This eliminates the need inside the loop to check to see if it is inside the grid and also cuts out visits to cells outside the grid. So I rewrote the alorithm, and got no speedup! Why? Are logic checks just so fast that algorithmically eliminating them has no impact? I have -O2 in the cflags, nothing special to try to speed things up. Maybe gcc3 is being my friend here?

I'm going to show the code snip below, but I'm just bringing this up to say that optimization is a drag when you get no speedup out of it! And to ask the musical question: How long do you stare at a method like this before you conclude that it is as fast as you can possibly make it?


Here is the "old method". The macro D2D is the same as the Swarm discrete2dsiteAt() macro in the Grid2d class. I just got tired of typing that over and over so relabeled it as D2D. The loop goes to cell (xx,yy) and then broadcasts out from there, looping over the radius around it.

- stepRule
{
 unsigned xx, yy;

 for (xx = 0; xx < xsize; xx++)
   for (yy = 0; yy < ysize; yy++)
     {
       int i, levelHere =0;
           //first flush through the population changes
      for ( i = 0; i < OPINIONS; i++)
       {
         if ( (levelHere=*D2D(popChange[i],gridOffsets,xx,yy)) != 0)
           {
             int r, c, xcoord, ycoord;
             BOOL outOfBoundsX=NO, outOfBoundsY=NO;
//broadcast that change across a square that goes "radius" units in all directions from xx,yy
             for(r=0;r<(2*radius)+1;r++) //r is for row
              {
               xcoord = xx+r-radius;
outOfBoundsX = (xcoord < 0)||(xcoord > xsize-1);//happens often if radius is big
              if (outOfBoundsX)
                 {
if (wrapAround == NO) continue; //if outside bounds, then quit here
                  else xcoord = (xcoord + xsize) % xsize;
                  }
               for(c=0;c<(2*radius)+1;c++) //c is for column
                {
                   ycoord= yy+c-radius;
                  outOfBoundsY = (ycoord < 0)||(ycoord > ysize-1);
                  if (outOfBoundsY)
                 {
if (wrapAround == NO) continue; //if outside bounds, then quit here
                     else
                     {
                        ycoord = (ycoord + ysize) % ysize;
                     }
                   }
*D2D(visibleOpinion[i],gridOffsets,xcoord,ycoord) += levelHere; }
           }
         }
     }


Here is the new faster more efficient code that is no damn faster!

- stepRule
{
 unsigned xx, yy;

 for (xx = 0; xx < xsize; xx++){
   int leftX = xx - radius;
   int rightX = xx + radius;
   if (wrapAround == NO)
      {
         if ( leftX < 0 ) leftX = 0;
          if ( rightX > xsize - 1 ) rightX = xsize - 1;
        }
     for (yy = 0; yy < ysize; yy++)
      {
         int i, levelHere =0;
         int lowY = yy - radius;
         int highY = yy + radius;
         if (wrapAround == NO)
          {
             if ( lowY < 0 ) lowY = 0;
             if ( highY > ysize -1) highY = ysize - 1;
           }
    //first flush through the population changes
    for ( i = 0; i < OPINIONS; i++)
      {
        if ( (levelHere = *D2D(popChange[i],gridOffsets,xx,yy)) != 0)
         {
           int xcoord, ycoord;
           if (wrapAround == NO)
          {
            for( xcoord = leftX; xcoord < rightX+1; xcoord++ )
              {
              for( ycoord=lowY; ycoord < highY+1; ycoord++)
                {
*D2D(visibleOpinion[i],gridOffsets,xcoord,ycoord) += levelHere;
                 }
               }
            }
         else
            {
               int r, c ; //r for row, c for column
              for(r = leftX; r < rightX+1; r++)
                {
                  xcoord = r;
if (r < 0) xcoord = xsize + r; //avoid use of slow (xcoord+xsize)%xsize!
                   else if (r > xsize - 1) xcoord = r - xsize; // ditto !
                 for (c = lowY; c < highY+1 ; c++)
                 {
                    ycoord = c;
if (c < 0) ycoord = ysize + c; //avoid use of slow (xcoord+xsize)%xsize! else if (c > ysize - 1) ycoord = c - ysize; // ditto ! *D2D(visibleOpinion[i],gridOffsets,xcoord,ycoord) += levelHere;
                  }
               }
             }
          }
      }




--
Paul E. Johnson                       email: address@hidden
Dept. of Political Science            http://lark.cc.ukans.edu/~pauljohn
University of Kansas                  Office: (785) 864-9086
Lawrence, Kansas 66045                FAX: (785) 864-5700



                 ==================================
  Swarm-Support is for discussion of the technical details of the day
  to day usage of Swarm.  For list administration needs (esp.
  [un]subscribing), please send a message to <address@hidden>
  with "help" in the body of the message.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]