Re: [Freepooma-devel] another stupid thought


From: Roman Krylov
Subject: Re: [Freepooma-devel] another stupid thought
Date: Mon, 11 Apr 2005 19:50:50 +0400
User-agent: Mozilla Thunderbird 1.0 (X11/20041206)

As for += on Array<N,Vector<n,T> >: I meant doing it for comp(0), then for comp(1), and so on, only in specialized situations where T is a primitive type such as double (in general T can be another Vector<> or any arbitrary type satisfying the concept of Vector<>::Element_t).
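
A minimal sketch of the component-by-component += described above. It does not use the POOMA API: InterleavedArray and addAssignByComponent are hypothetical stand-ins, and the sketch assumes (the thread does not confirm this for Array<N,Vector<n,T> >) that the NC components of each element sit next to each other in one flat buffer of doubles.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the storage of an array of NC-component vectors.
// Assumption: element e, component c lives at data[e * NC + c].
template <std::size_t NC>
struct InterleavedArray {
  std::vector<double> data;
  explicit InterleavedArray(std::size_t n, double init = 0.0)
    : data(n * NC, init) {}
  std::size_t size() const { return data.size() / NC; }  // number of elements
};

// lhs += rhs, done for component 0 over all elements, then component 1, etc.
// Each inner loop is a plain loop over doubles with a constant stride of NC.
template <std::size_t NC>
void addAssignByComponent(InterleavedArray<NC>& lhs,
                          const InterleavedArray<NC>& rhs)
{
  assert(lhs.size() == rhs.size());
  const std::size_t n = lhs.size();
  double* l = lhs.data.data();
  const double* r = rhs.data.data();
  for (std::size_t c = 0; c < NC; ++c)     // comp(0), comp(1), ...
    for (std::size_t e = 0; e < n; ++e)    // strided walk over one component
      l[e * NC + c] += r[e * NC + c];
}
```

Whether a particular compiler vectorizes the strided inner loop depends on the compiler and its flags; the sketch only shows the loop shape.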

Roman.

 Hi again.
Does Array<N,Vector<n,T> > have a regular memory layout for its data,
so that if I have a pointer to the first element (of type T), I can calculate the pointer to the second and so on?
 It's all about autovect :)
 They say it supports strided access.
And maybe it would sometimes be advantageous to do some operations like
+= or *= without PETE?
These operations cannot avoid temporaries with PETE, do I understand
it right?
So if I have N '+=' I have N {tmp = LHS + RHS; LHS = tmp}. If RHS is
some complex expression which can benefit from PETE
and it is not bunched together with LHS, there will be no big loss, but I can
benefit from vectorizing the first statement in {...}.
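
The two-statement form {tmp = LHS + RHS; LHS = tmp} can be sketched as follows. This is only an illustration with plain std::vector<double> buffers, not POOMA or PETE code; addAssignViaTemporary is a hypothetical name, and rhs is assumed to hold an already evaluated right-hand side.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Performs lhs += rhs in two steps through an explicit temporary.
void addAssignViaTemporary(std::vector<double>& lhs,
                           const std::vector<double>& rhs)
{
  assert(lhs.size() == rhs.size());
  const std::size_t n = lhs.size();
  std::vector<double> tmp(n);

  const double* l = lhs.data();
  const double* r = rhs.data();
  double* t = tmp.data();

  // Statement 1: tmp = LHS + RHS. The destination is a fresh buffer, so it
  // does not overlap either input; this is the loop one would hope to see
  // vectorized.
  for (std::size_t i = 0; i < n; ++i)
    t[i] = l[i] + r[i];

  // Statement 2: LHS = tmp (a plain copy).
  std::copy(tmp.begin(), tmp.end(), lhs.begin());
}
```

The cost of this form is the extra buffer and the extra pass over memory, which is the trade-off the message is asking about.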

 Thanks.
 Roman.

P.S. Maybe it would be good to add a new mailing list for crazy ideas? :)
So as not to clutter freepooma-devel.

> Hi.
> Thanks for the multiple patches!
> Honestly, I haven't tested them yet; I'm still using my hacked version of InterpolatorCIC.h.
> I'll check them sometime this week.
> The performance enhancements mentioned earlier are attractive, of course :)
>
> And what about autovectorization: have you tried Devang Patel's '#pragma ivdep' patch on the autovect-branch?
> When I tried it, some rejections emerged, and I have had no other information since then. I don't even know what to expect from it: would it be able to vectorize operations on arrays of Vector<,>s? Is the problem only aliasing, or are there other obstacles? And would it be worthwhile for operations with an explicit L-R dependence like '+='?
>
> By the way, about the shortcoming I encountered earlier: as you mentioned, without bounds checking of the indexes cellContaining() works, but SEGFAULTs. In addition: if the position attributes are outside the domain but position[i]-origin (of Vector<,> type) has all positive components, then there is no SEGFAULT but strange behaviour occurs; it segfaults if one of the position[i]-origin coordinates is <0.
> So there should always be some BC whose boundaries coincide with the mesh (as in PIC2d). And sync(), or something else that invokes the BCs, should be called before gather or scatter.
>
> Thanks.
> Roman.
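
The out-of-domain cases described above (all offsets positive but past the mesh, or a negative offset) can be caught with an explicit check before a cell index is used. The following is a sketch only: UniformMesh and checkedCellContaining are hypothetical names, not the POOMA cellContaining() interface, and a uniform mesh with a fixed origin and spacing is assumed.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical uniform mesh description (not a POOMA type).
template <std::size_t Dim>
struct UniformMesh {
  std::array<double, Dim> origin;   // lower corner of the physical domain
  std::array<double, Dim> spacing;  // cell size per dimension
  std::array<int, Dim> cells;       // number of cells per dimension
};

// Computes the cell containing pos. Returns false, leaving cell untouched,
// if pos lies outside the mesh in any dimension.
template <std::size_t Dim>
bool checkedCellContaining(const UniformMesh<Dim>& m,
                           const std::array<double, Dim>& pos,
                           std::array<int, Dim>& cell)
{
  std::array<int, Dim> result{};
  for (std::size_t d = 0; d < Dim; ++d) {
    const double idx = std::floor((pos[d] - m.origin[d]) / m.spacing[d]);
    // Written so that NaN also fails the test.
    if (!(idx >= 0.0) || !(idx < static_cast<double>(m.cells[d])))
      return false;
    result[d] = static_cast<int>(idx);
  }
  cell = result;
  return true;
}
```

A caller would skip, relocate, or report a particle for which the function returns false, instead of indexing the field with an invalid cell.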
>
>> On Mon, 11 Apr 2005, Richard Guenther wrote:
>>
>>
>>
>>> On Mon, 11 Apr 2005, Richard Guenther wrote:
>>>
>>>
>>>
>>>> On Fri, 1 Apr 2005, Richard Guenther wrote:
>>>>
>>>>
>>>>
>>>>> Hi!
>>>>>
>>>>> This patch canonicalizes the handling of hasInternalGuards_m and
>>>>> hasExternalGuards_m in the various grid layouts. It also disables
>>>>> optimizing away of internal guards if the partitioner will create
>>>>> at most one patch.
>>>>>
>>>>
>>>>
>>>> I believe that the current way of doing things is correct (apart
>>>> from not all places being consistent), i.e. have hasInternalGuards
>>>> and guards() being consistent. What the Interpolators really
>>>> are asking from the Layouts is the maximum supported stencil size,
>>>> a thing the layouts cannot answer atm. For this we need to provide
>>>> functionality.
>>>>
>>>
>>>
>>
>> Like the attached patch. Roman - does this work for you?
>>
>> Richard.
>>
>>
>> 2005Apr11 Richard Guenther <address@hidden>
>>
>> * src/Particles/Interpolation.h (getMaximumStencilWidth):
>> New function.
>> src/Particles/InterpolatorCIC.h: Use it, instead of relying
>> on internal guards.
>> src/Particles/InterpolatorSUDS.h: Likewise.
>>
>>
>> ------------------------------------------------------------------------
>>
>> ? LINUXgcc
>> ? LINUXgcc-opt
>> ? tests/LINUXgcc
>> ? tests/LINUXgcc-opt
>> Index: Interpolation.h
>> ===================================================================
>> RCS file: /cvsroot/freepooma/freepooma/src/Particles/Interpolation.h,v
>> retrieving revision 1.10
>> diff -u -r1.10 Interpolation.h
>> --- Interpolation.h 1 Nov 2004 18:16:59 -0000 1.10
>> +++ Interpolation.h 11 Apr 2005 12:56:24 -0000
>> @@ -220,6 +220,25 @@
>> void setExternalGuards(const Field&, typename Field::Element_t);
>>
>>
>> +/// getMaximumStencilWidth returns the maximum extent a stencil may
>> +/// have if operating on the physical domain.
>> +
>> +template <class Layout>
>> +GuardLayers<Layout::dimensions> getMaximumStencilWidth(const Layout& l)
>> +{
>> + GuardLayers<Layout::dimensions> gl = l.externalGuards();
>> + if (l.sizeGlobal() > 1) {
>> + for (int i=0; i<Layout::dimensions; ++i) {
>> + if (l.internalGuards().lower(i) < gl.lower(i))
>> + gl.lower(i) = l.internalGuards().lower(i);
>> + if (l.internalGuards().upper(i) < gl.upper(i))
>> + gl.upper(i) = l.internalGuards().upper(i);
>> + }
>> + }
>> + return gl;
>> +}
>> +
>> +
>> #include "Particles/Interpolation.cpp"
>>
>>
>> Index: InterpolatorCIC.h
>> ===================================================================
>> RCS file: /cvsroot/freepooma/freepooma/src/Particles/InterpolatorCIC.h,v
>> retrieving revision 1.12
>> diff -u -r1.12 InterpolatorCIC.h
>> --- InterpolatorCIC.h 1 Nov 2004 18:16:59 -0000 1.12
>> +++ InterpolatorCIC.h 11 Apr 2005 12:56:24 -0000
>> @@ -152,7 +152,7 @@
>> "Field and Particle Position must have same number of patches!");
>>
>> // Check that the Field has adequate guard layers for CIC
>> - const GuardLayers<Dim>& gl = field.layout().internalGuards();
>> + GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
>> for (int d=0; d<Dim; ++d)
>> {
>> PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
>> @@ -188,7 +188,7 @@
>> "Field and Particle Position must have same number of patches!");
>>
>> // Check that the Field has adequate guard layers for CIC
>> - const GuardLayers<Dim>& gl = field.layout().internalGuards();
>> + GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
>> for (int d=0; d<Dim; ++d)
>> {
>> PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
>> @@ -231,7 +231,7 @@
>> "Field and Particle Position must have same number of patches!");
>>
>> // Check that the Field has adequate GuardLayers for CIC
>> - const GuardLayers<Dim>& gl = field.layout().internalGuards();
>> + GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
>> for (int d=0; d<Dim; ++d)
>> {
>> PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
>> @@ -281,7 +281,7 @@
>> "Field and Particle CacheData must have same number of patches!");
>>
>> // Check that the Field has adequate GuardLayers for CIC
>> - const GuardLayers<Dim>& gl = field.layout().internalGuards();
>> + GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
>> for (int d=0; d<Dim; ++d)
>> {
>> PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
>> @@ -320,7 +320,7 @@
>> "Field and Particle CacheData must have same number of patches!");
>>
>> // Check that the Field has adequate GuardLayers for CIC
>> - const GuardLayers<Dim>& gl = field.layout().internalGuards();
>> + GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
>> for (int d=0; d<Dim; ++d)
>> {
>> PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
>> @@ -365,7 +365,7 @@
>> "Field and Particle CacheData must have same number of patches!");
>>
>> // Check that the Field has adequate GuardLayers for CIC
>> - const GuardLayers<Dim>& gl = field.layout().internalGuards();
>> + GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
>> for (int d=0; d<Dim; ++d)
>> {
>> PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
>> @@ -412,7 +412,7 @@
>> "Field and Particle CacheData must have same number of patches!");
>>
>> // Check that the Field has adequate GuardLayers for CIC
>> - const GuardLayers<Dim>& gl = field.layout().internalGuards();
>> + GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
>> for (int d=0; d<Dim; ++d)
>> {
>> PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
>> @@ -449,7 +449,7 @@
>> "Field and Particle CacheData must have same number of patches!");
>>
>> // Check that the Field has adequate GuardLayers for CIC
>> - const GuardLayers<Dim>& gl = field.layout().internalGuards();
>> + GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
>> for (int d=0; d<Dim; ++d)
>> {
>> PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
>> @@ -492,7 +492,7 @@
>> "Field and Particle CacheData must have same number of patches!");
>>
>> // Check that the Field has adequate GuardLayers for CIC
>> - const GuardLayers<Dim>& gl = field.layout().internalGuards();
>> + GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
>> for (int d=0; d<Dim; ++d)
>> {
>> PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
>> Index: InterpolatorSUDS.h
>> ===================================================================
>> RCS file: /cvsroot/freepooma/freepooma/src/Particles/InterpolatorSUDS.h,v
>> retrieving revision 1.12
>> diff -u -r1.12 InterpolatorSUDS.h
>> --- InterpolatorSUDS.h 1 Nov 2004 18:16:59 -0000 1.12
>> +++ InterpolatorSUDS.h 11 Apr 2005 12:56:24 -0000
>> @@ -153,7 +153,7 @@
>> "Field and Particle Position must have same number of patches!");
>>
>> // Check that the Field has adequate guard layers for SUDS
>> - const GuardLayers<Dim>& gl = field.layout().internalGuards();
>> + GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
>> for (int d=0; d<Dim; ++d)
>> {
>> PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
>> @@ -189,7 +189,7 @@
>> "Field and Particle Position must have same number of patches!");
>>
>> // Check that the Field has adequate guard layers for SUDS
>> - const GuardLayers<Dim>& gl = field.layout().internalGuards();
>> + GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
>> for (int d=0; d<Dim; ++d)
>> {
>> PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
>> @@ -232,7 +232,7 @@
>> "Field and Particle Position must have same number of patches!");
>>
>> // Check that the Field has adequate GuardLayers for SUDS
>> - const GuardLayers<Dim>& gl = field.layout().internalGuards();
>> + GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
>> for (int d=0; d<Dim; ++d)
>> {
>> PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
>> @@ -282,7 +282,7 @@
>> "Field and Particle CacheData must have same number of patches!");
>>
>> // Check that the Field has adequate GuardLayers for SUDS
>> - const GuardLayers<Dim>& gl = field.layout().internalGuards();
>> + GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
>> for (int d=0; d<Dim; ++d)
>> {
>> PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
>> @@ -321,7 +321,7 @@
>> "Field and Particle CacheData must have same number of patches!");
>>
>> // Check that the Field has adequate GuardLayers for SUDS
>> - const GuardLayers<Dim>& gl = field.layout().internalGuards();
>> + GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
>> for (int d=0; d<Dim; ++d)
>> {
>> PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
>> @@ -366,7 +366,7 @@
>> "Field and Particle CacheData must have same number of patches!");
>>
>> // Check that the Field has adequate GuardLayers for SUDS
>> - const GuardLayers<Dim>& gl = field.layout().internalGuards();
>> + GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
>> for (int d=0; d<Dim; ++d)
>> {
>> PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
>> @@ -413,7 +413,7 @@
>> "Field and Particle CacheData must have same number of patches!");
>>
>> // Check that the Field has adequate GuardLayers for SUDS
>> - const GuardLayers<Dim>& gl = field.layout().internalGuards();
>> + GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
>> for (int d=0; d<Dim; ++d)
>> {
>> PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
>> @@ -450,7 +450,7 @@
>> "Field and Particle CacheData must have same number of patches!");
>>
>> // Check that the Field has adequate GuardLayers for SUDS
>> - const GuardLayers<Dim>& gl = field.layout().internalGuards();
>> + GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
>> for (int d=0; d<Dim; ++d)
>> {
>> PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
>> @@ -493,7 +493,7 @@
>> "Field and Particle CacheData must have same number of patches!");
>>
>> // Check that the Field has adequate GuardLayers for SUDS
>> - const GuardLayers<Dim>& gl = field.layout().internalGuards();
>> + GuardLayers<Dim> gl = getMaximumStencilWidth(field.layout());
>> for (int d=0; d<Dim; ++d)
>> {
>> PInsist(gl.lower(d)>=1 && gl.upper(d)>=1,
>>
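
The logic of getMaximumStencilWidth in the patch above can be tried in isolation with the sketch below. It does not use the POOMA GuardLayers or layout classes: Guards, maximumStencilWidth, and supportsStencil are hypothetical stand-ins, and patchCount stands in for the value the patch reads from l.sizeGlobal() (assumed here to be the number of patches).

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Hypothetical stand-in for GuardLayers<Dim>: guard widths per dimension.
template <std::size_t Dim>
struct Guards {
  std::array<int, Dim> lower;
  std::array<int, Dim> upper;
};

// Starts from the external guards; if there is more than one patch, the
// usable width is also limited by the internal guards, so take the
// per-dimension minimum of the two.
template <std::size_t Dim>
Guards<Dim> maximumStencilWidth(const Guards<Dim>& external,
                                const Guards<Dim>& internal,
                                int patchCount)
{
  Guards<Dim> gl = external;
  if (patchCount > 1) {
    for (std::size_t d = 0; d < Dim; ++d) {
      gl.lower[d] = std::min(gl.lower[d], internal.lower[d]);
      gl.upper[d] = std::min(gl.upper[d], internal.upper[d]);
    }
  }
  return gl;
}

// Mirrors the PInsist condition in the interpolators: every dimension needs
// at least `width` guard layers on both sides.
template <std::size_t Dim>
bool supportsStencil(const Guards<Dim>& gl, int width)
{
  for (std::size_t d = 0; d < Dim; ++d)
    if (gl.lower[d] < width || gl.upper[d] < width)
      return false;
  return true;
}
```

With a single patch the internal guards are ignored, so a layout with external guards only still passes the width-1 check that CIC and SUDS perform.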
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> Freepooma-devel mailing list
>> address@hidden
>> http://lists.nongnu.org/mailman/listinfo/freepooma-devel


