
[Freepooma-devel] another stupid thought


From: Roman Krylov
Subject: [Freepooma-devel] another stupid thought
Date: Mon, 11 Apr 2005 19:31:03 +0400
User-agent: Mozilla Thunderbird 1.0 (X11/20041206)

Hi again.
Does Array<N,Vector<n,T> > have a regular memory layout for its data, so that if I have a pointer to the first element (of type T), I can calculate the pointer to the second, and so on?
It's all about autovect :)
   They say it supports strided access.
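
To make the layout question concrete, here is a minimal sketch in plain C++. It does not use POOMA at all: SmallVec is a hypothetical stand-in for Vector<n,T>, and a std::vector stands in for the Array storage, so it only shows what "regular stride" would mean, not what Array actually guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for Vector<N,T>: a fixed-size block of T, nothing else.
template <int N, class T>
struct SmallVec {
  T data[N];
};

// Byte offset of component c of element i from the start of the buffer.
template <int N, class T>
std::size_t byteOffset(const std::vector<SmallVec<N, T> >& a,
                       std::size_t i, int c) {
  std::uintptr_t base = reinterpret_cast<std::uintptr_t>(&a[0].data[0]);
  std::uintptr_t elem = reinterpret_cast<std::uintptr_t>(&a[i].data[c]);
  return static_cast<std::size_t>(elem - base);
}

// True if every component sits where a flat T[N * size] layout would put it,
// i.e. component c of element i is at offset (i * N + c) * sizeof(T).
template <int N, class T>
bool hasRegularStride(const std::vector<SmallVec<N, T> >& a) {
  for (std::size_t i = 0; i < a.size(); ++i)
    for (int c = 0; c < N; ++c)
      if (byteOffset(a, i, c) != (i * N + c) * sizeof(T))
        return false;
  return true;
}
```

If a check like this holds for the real element type, the same component of consecutive elements is separated by a constant stride of N * sizeof(T) bytes, which is the access pattern the question is about.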
And maybe it would sometimes be advantageous to perform operations like += or *= without PETE? These operations cannot avoid temporaries with PETE, do I understand that right? So if I have N '+=' operations, I have N of {tmp = LHS + RHS; LHS = tmp}. If RHS is some complex expression that can benefit from PETE, then not fusing it with LHS is no big loss, and I can benefit from vectorizing the first statement in {...}.
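
The two forms being compared can be written out as plain loops. This is only an illustration on std::vector<double>, with hypothetical function names; it says nothing about what PETE actually generates for '+='.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Form 1: the two-step update {tmp = LHS + RHS; LHS = tmp} with an
// explicit temporary. The first loop reads lhs and rhs and writes only tmp.
inline void plusAssignViaTemp(std::vector<double>& lhs,
                              const std::vector<double>& rhs) {
  assert(lhs.size() == rhs.size());
  std::vector<double> tmp(lhs.size());
  for (std::size_t i = 0; i < lhs.size(); ++i)
    tmp[i] = lhs[i] + rhs[i];
  for (std::size_t i = 0; i < lhs.size(); ++i)
    lhs[i] = tmp[i];
}

// Form 2: the fused update, where lhs is both read and written in one loop.
inline void plusAssignFused(std::vector<double>& lhs,
                            const std::vector<double>& rhs) {
  assert(lhs.size() == rhs.size());
  for (std::size_t i = 0; i < lhs.size(); ++i)
    lhs[i] += rhs[i];
}
```

Both forms produce the same values; they differ only in the extra storage and the extra pass over the data in the first form.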

Thanks.
Roman.

P.S. Maybe it would be good to add a new mailing list for crazy ideas? :) So as not to clutter freepooma-devel.

Hi.
Thanks for multiple patches!
Honestly, I haven't tested them yet; I'm still using my hacked version of InterpolatorCIC.h.
I'll check them sometime this week.
The performance enhancements mentioned earlier are attractive, of course :)

And what about autovectorization: have you tried Devang Patel's '#pragma ivdep' patch in the autovect-branch? When I tried it, some rejections emerged, and I have had no other information since then. But I don't even know what to expect from it: would it be able to vectorize operations on arrays of Vector<,>s? Is aliasing the only problem, or are there other obstacles? And would it be worthwhile for operations with an explicit L-R dependence like '+='?

By the way, about the shortcoming I encountered earlier: as you mentioned, without bounds checking of the indexes, cellContaining() works, but SEGFAULTs. In addition: if the position attributes are outside the domain but position[i]-origin (of Vector<,> type) has all positive components, then there is no SEGFAULT, but strange behaviour occurs; it segfaults if one of the position[i]-origin coordinates is < 0. So there should always be some BC whose boundaries coincide with the mesh (as in PIC2d). And sync(), or something else that invokes the BCs, should be called before gather or scatter.
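
The kind of bounds check meant here can be sketched in one dimension. Mesh1D and checkedCellContaining are hypothetical names used only for this illustration; this is not POOMA's cellContaining(), just the check that would reject both the negative-offset case and the positive out-of-domain case.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 1-D uniform mesh: nCells cells of width dx starting at origin.
struct Mesh1D {
  double origin;
  double dx;
  int nCells;
};

// Returns the index of the cell containing x, or -1 if x lies outside
// the mesh (below the origin, at or beyond the upper edge, or NaN).
inline int checkedCellContaining(const Mesh1D& m, double x) {
  double offset = (x - m.origin) / m.dx;
  if (!(offset >= 0.0) || !(offset < m.nCells))
    return -1;
  return static_cast<int>(std::floor(offset));
}
```

A caller would treat -1 as "particle has left the domain" and apply a boundary condition before any gather or scatter uses the index.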

Thanks.
Roman.

On Mon, 11 Apr 2005, Richard Guenther wrote:

On Fri, 1 Apr 2005, Richard Guenther wrote:

Hi!

This patch canonicalizes the handling of hasInternalGuards_m and
hasExternalGuards_m in the various grid layouts.  It also disables
optimizing away of internal guards if the partitioner will create
at most one patch.

I believe that the current way of doing things is correct (apart
from not all places being consistent), i.e. have hasInternalGuards
and guards() being consistent.  What the Interpolators really
are asking from the Layouts is the maximum supported stencil size,
a thing the layouts cannot answer atm.  For this we need to provide
functionality.


Like the attached patch.  Roman - does this work for you?

Richard.


2005Apr11  Richard Guenther <address@hidden>

    * src/Particles/Interpolation.h (getMaximumStencilWidth):
    New function.
    src/Particles/InterpolatorCIC.h: Use it, instead of relying
    on internal guards.
    src/Particles/InterpolatorSUDS.h: Likewise.
------------------------------------------------------------------------

? LINUXgcc
? LINUXgcc-opt
? tests/LINUXgcc
? tests/LINUXgcc-opt
Index: Interpolation.h
===================================================================
RCS file: /cvsroot/freepooma/freepooma/src/Particles/Interpolation.h,v
retrieving revision 1.10
diff -u -r1.10 Interpolation.h
--- Interpolation.h    1 Nov 2004 18:16:59 -0000    1.10
+++ Interpolation.h    11 Apr 2005 12:56:24 -0000
@@ -220,6 +220,25 @@
void setExternalGuards(const Field&, typename Field::Element_t);


+/// getMaximumStencilWidth returns the maximum extent a stencil may
+/// have if operating on the physical domain.
+
+template <class Layout>
+GuardLayers<Layout::dimensions> getMaximumStencilWidth(const Layout& l)
+{
+  GuardLayers<Layout::dimensions> gl = l.externalGuards();
+  if (l.sizeGlobal() > 1) {
+    for (int i=0; i<Layout::dimensions; ++i) {
+      if (l.internalGuards().lower(i) < gl.lower(i))
+        gl.lower(i) = l.internalGuards().lower(i);
+      if (l.internalGuards().upper(i) < gl.upper(i))
+        gl.upper(i) = l.internalGuards().upper(i);
+    }
+  }
+  return gl;
+}
+
+
#include "Particles/Interpolation.cpp"


Index: InterpolatorCIC.h
===================================================================
RCS file: /cvsroot/freepooma/freepooma/src/Particles/InterpolatorCIC.h,v
retrieving revision 1.12
diff -u -r1.12 InterpolatorCIC.h
--- InterpolatorCIC.h    1 Nov 2004 18:16:59 -0000    1.12
+++ InterpolatorCIC.h    11 Apr 2005 12:56:24 -0000
@@ -152,7 +152,7 @@
"Field and Particle Position must have same number of patches!");

      // Check that the Field has adequate guard layers for CIC
-      const GuardLayers<Dim>& gl = field.layout().internalGuards();
+      GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
      for (int d=0; d<Dim; ++d)
        {
          PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
@@ -188,7 +188,7 @@
"Field and Particle Position must have same number of patches!");

      // Check that the Field has adequate guard layers for CIC
-      const GuardLayers<Dim>& gl = field.layout().internalGuards();
+      GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
      for (int d=0; d<Dim; ++d)
        {
          PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
@@ -231,7 +231,7 @@
"Field and Particle Position must have same number of patches!");

      // Check that the Field has adequate GuardLayers for CIC
-      const GuardLayers<Dim>& gl = field.layout().internalGuards();
+      GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
      for (int d=0; d<Dim; ++d)
        {
          PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
@@ -281,7 +281,7 @@
"Field and Particle CacheData must have same number of patches!");

      // Check that the Field has adequate GuardLayers for CIC
-      const GuardLayers<Dim>& gl = field.layout().internalGuards();
+      GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
      for (int d=0; d<Dim; ++d)
        {
          PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
@@ -320,7 +320,7 @@
"Field and Particle CacheData must have same number of patches!");

      // Check that the Field has adequate GuardLayers for CIC
-      const GuardLayers<Dim>& gl = field.layout().internalGuards();
+      GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
      for (int d=0; d<Dim; ++d)
        {
          PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
@@ -365,7 +365,7 @@
"Field and Particle CacheData must have same number of patches!");

      // Check that the Field has adequate GuardLayers for CIC
-      const GuardLayers<Dim>& gl = field.layout().internalGuards();
+      GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
      for (int d=0; d<Dim; ++d)
        {
          PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
@@ -412,7 +412,7 @@
"Field and Particle CacheData must have same number of patches!");

      // Check that the Field has adequate GuardLayers for CIC
-      const GuardLayers<Dim>& gl = field.layout().internalGuards();
+      GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
      for (int d=0; d<Dim; ++d)
        {
          PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
@@ -449,7 +449,7 @@
"Field and Particle CacheData must have same number of patches!");

      // Check that the Field has adequate GuardLayers for CIC
-      const GuardLayers<Dim>& gl = field.layout().internalGuards();
+      GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
      for (int d=0; d<Dim; ++d)
        {
          PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
@@ -492,7 +492,7 @@
"Field and Particle CacheData must have same number of patches!");

      // Check that the Field has adequate GuardLayers for CIC
-      const GuardLayers<Dim>& gl = field.layout().internalGuards();
+      GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
      for (int d=0; d<Dim; ++d)
        {
          PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
Index: InterpolatorSUDS.h
===================================================================
RCS file: /cvsroot/freepooma/freepooma/src/Particles/InterpolatorSUDS.h,v
retrieving revision 1.12
diff -u -r1.12 InterpolatorSUDS.h
--- InterpolatorSUDS.h    1 Nov 2004 18:16:59 -0000    1.12
+++ InterpolatorSUDS.h    11 Apr 2005 12:56:24 -0000
@@ -153,7 +153,7 @@
"Field and Particle Position must have same number of patches!");

      // Check that the Field has adequate guard layers for SUDS
-      const GuardLayers<Dim>& gl = field.layout().internalGuards();
+      GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
      for (int d=0; d<Dim; ++d)
        {
          PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
@@ -189,7 +189,7 @@
"Field and Particle Position must have same number of patches!");

      // Check that the Field has adequate guard layers for SUDS
-      const GuardLayers<Dim>& gl = field.layout().internalGuards();
+      GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
      for (int d=0; d<Dim; ++d)
        {
          PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
@@ -232,7 +232,7 @@
"Field and Particle Position must have same number of patches!");

      // Check that the Field has adequate GuardLayers for SUDS
-      const GuardLayers<Dim>& gl = field.layout().internalGuards();
+      GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
      for (int d=0; d<Dim; ++d)
        {
          PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
@@ -282,7 +282,7 @@
"Field and Particle CacheData must have same number of patches!");

      // Check that the Field has adequate GuardLayers for SUDS
-      const GuardLayers<Dim>& gl = field.layout().internalGuards();
+      GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
      for (int d=0; d<Dim; ++d)
        {
          PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
@@ -321,7 +321,7 @@
"Field and Particle CacheData must have same number of patches!");

      // Check that the Field has adequate GuardLayers for SUDS
-      const GuardLayers<Dim>& gl = field.layout().internalGuards();
+      GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
      for (int d=0; d<Dim; ++d)
        {
          PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
@@ -366,7 +366,7 @@
"Field and Particle CacheData must have same number of patches!");

      // Check that the Field has adequate GuardLayers for SUDS
-      const GuardLayers<Dim>& gl = field.layout().internalGuards();
+      GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
      for (int d=0; d<Dim; ++d)
        {
          PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
@@ -413,7 +413,7 @@
"Field and Particle CacheData must have same number of patches!");

      // Check that the Field has adequate GuardLayers for SUDS
-      const GuardLayers<Dim>& gl = field.layout().internalGuards();
+      GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
      for (int d=0; d<Dim; ++d)
        {
          PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
@@ -450,7 +450,7 @@
"Field and Particle CacheData must have same number of patches!");

      // Check that the Field has adequate GuardLayers for SUDS
-      const GuardLayers<Dim>& gl = field.layout().internalGuards();
+      GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
      for (int d=0; d<Dim; ++d)
        {
          PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
@@ -493,7 +493,7 @@
"Field and Particle CacheData must have same number of patches!");

      // Check that the Field has adequate GuardLayers for SUDS
-      const GuardLayers<Dim>& gl = field.layout().internalGuards();
+      GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
      for (int d=0; d<Dim; ++d)
        {
          PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
------------------------------------------------------------------------
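
For readers who want to try the logic of getMaximumStencilWidth outside POOMA, here is a standalone sketch. Guards is a hypothetical stand-in for GuardLayers<Dim>, and numPatches stands in for l.sizeGlobal(), on the assumption that sizeGlobal() returns the number of patches; adequateForWidthOne mirrors the PInsist condition in the interpolators.

```cpp
#include <cassert>

// Hypothetical stand-in for GuardLayers<Dim>: guard widths per dimension.
template <int Dim>
struct Guards {
  int lower[Dim];
  int upper[Dim];
};

// Start from the external guards; with more than one patch, clamp each
// side to the internal guard width when that is smaller.
template <int Dim>
Guards<Dim> maximumStencilWidth(const Guards<Dim>& external,
                                const Guards<Dim>& internal,
                                int numPatches) {
  Guards<Dim> gl = external;
  if (numPatches > 1) {
    for (int i = 0; i < Dim; ++i) {
      if (internal.lower[i] < gl.lower[i])
        gl.lower[i] = internal.lower[i];
      if (internal.upper[i] < gl.upper[i])
        gl.upper[i] = internal.upper[i];
    }
  }
  return gl;
}

// Same condition as the PInsist check: at least one guard layer per side.
template <int Dim>
bool adequateForWidthOne(const Guards<Dim>& gl) {
  for (int d = 0; d < Dim; ++d)
    if (!(gl.lower[d] >= 1 && gl.upper[d] >= 1))
      return false;
  return true;
}
```

With a single patch the internal guards are ignored, so a layout with external guards only still passes the check; with several patches the smaller of the two widths decides.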

_______________________________________________
Freepooma-devel mailing list
address@hidden
http://lists.nongnu.org/mailman/listinfo/freepooma-devel


