Re: [Help-gsl] Questions about the code of some functions in cblas imple
From: Brian Gough
Subject: Re: [Help-gsl] Questions about the code of some functions in cblas implementation
Date: Tue, 23 Jun 2009 17:40:23 +0100
User-agent: Wanderlust/2.14.0 (Africa) Emacs/22.1 Mule/5.0 (SAKAKI)
At Thu, 18 Jun 2009 23:42:57 +0200,
José Luis García Pallero wrote:
> No loop unrolling: 0.005 s
> Loop unrolling: 0.6 s
>
> for(i=0;i<n;i++)
> {
> a = i*i+i;
> }
> }
I think the program below is probably more realistic for this
case. Given the huge difference between the two results I suspect that
the compiler is able to overoptimise the simple case above. Maybe you
could compare this or the actual function.
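#include <stdlib.h>
#include <time.h>
#include <stdio.h>

int
main (int argc, char *argv[])
{
  int n = 0, i = 0, j, m;
  double *a, *x;
  double t0, t1, t2;
  double A = 3, B = 2;

  if (argc > 1)
    n = atoi (argv[1]);                   /* vector length */
  m = (argc > 2) ? atoi (argv[2]) : n;    /* repetitions; same as n if omitted */
  if (n <= 0 || m <= 0)
    {
      fprintf (stderr, "usage: %s n [m]\n", argv[0]);
      return 1;
    }

  a = malloc (sizeof (double) * n);
  x = malloc (sizeof (double) * n);
  if (a == NULL || x == NULL)
    {
      fprintf (stderr, "out of memory\n");
      return 1;
    }

  for (i = 0; i < n; i++)       /* give x defined values before it is read */
    x[i] = i;

  t0 = clock ();
  {
    for (j = 0; j < m; j++)
      for (i = 0; i < n; i++)
        {
          a[i] = A * x[i] + B;
        }
  }
  t1 = clock ();
  {
    for (j = 0; j < m; j++)
      {
        for (i = 0; i + 3 < n; i += 4)
          {
            a[i] = A * x[i] + B;
            a[i + 1] = A * x[i + 1] + B;
            a[i + 2] = A * x[i + 2] + B;
            a[i + 3] = A * x[i + 3] + B;
          }
        for (; i < n; i++)      /* leftovers when n is not a multiple of 4 */
          a[i] = A * x[i] + B;
      }
  }
  t2 = clock ();

  /* Times are raw clock () ticks, not seconds. */
  printf ("operations = %g\n", (double) n * m);
  printf ("plain loop = %g\n", t1 - t0);
  printf ("fancy loop = %g\n", t2 - t1);

  free (a);
  free (x);
  return 0;
}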
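
Two caveats when reading the output of a program like this. The times are raw clock () ticks, whereas the figures quoted at the top are in seconds. Also, a[] is never read after the loops, so an optimising compiler may be free to drop some of the work, which is the kind of overoptimisation suspected above. The sketch below shows one way to address both. The helper names ticks_to_seconds and checksum are my own, purely illustrative, and not part of GSL or its cblas.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: convert two clock () readings to CPU seconds. */
double
ticks_to_seconds (clock_t start, clock_t end)
{
  return (double) (end - start) / CLOCKS_PER_SEC;
}

/* Hypothetical helper: sum the results so that they are actually used.
   Printing this value after each timed loop makes it harder for the
   compiler to treat the stores into a[] as dead code. */
double
checksum (const double *a, size_t n)
{
  double sum = 0.0;
  size_t i;

  assert (a != NULL || n == 0);
  for (i = 0; i < n; i++)
    sum += a[i];
  return sum;
}
```

For example, printf ("plain loop = %g s (checksum %g)\n", ticks_to_seconds (t0, t1), checksum (a, n)) right after the first loop, and the same with t1 and t2 after the second. Matching checksums also give a quick check that the two loops compute the same thing.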