

From: Richard Guenther
Subject: Re: ReRe: [Freepooma-devel] [PATCH] Canonicalize handling of external/internal guards
Date: Mon, 11 Apr 2005 17:08:22 +0200 (CEST)

On Mon, 11 Apr 2005, Roman Krylov wrote:

> Hi.
> Thanks for multiple patches!
> honestly I haven't tested it yet, I'm still using my hacked version of
> InterpolatorCIC.h
> I'll check them somehow this week.

Ok, if the patch works for you, I'll apply it to HEAD and r2_branch.

> Performance enhancements mentioned earlier are attractive of course :)
>
> And what about autovectorization: haven't you tried Devang Patel's
> '#pragma ivdep' patch in autovect-branch?

Yes I did - it didn't make a difference though.  There were remaining
problems with loop optimizations in general, as somehow we ended up
with multiple basic blocks and vectorization doesn't handle this.

I'm currently tracking only CVS HEAD (aka gcc 4.1), but there
vectorization doesn't do anything interesting.  Note also that the
Intel compiler cannot vectorize POOMA loops despite using
#pragma ivdep, so I don't expect gcc to be better soon.
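
For illustration only, here is the kind of loop such a pragma is meant for: a flat loop with a single basic block and no aliasing between operands. This is a sketch, not POOMA code. The function name `axpy` is made up, `__restrict` is a compiler extension, and the `#pragma GCC ivdep` spelling is an assumption based on later GCC releases, not the autovect-branch patch discussed here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical kernel (not from POOMA): y[i] += a * x[i].
// __restrict is a compiler extension promising that x and y do not overlap.
void axpy(double* __restrict y, const double* __restrict x,
          double a, std::size_t n)
{
    assert(y != nullptr && x != nullptr);

    // The pragma asserts that there are no loop-carried dependences.
    // Which spelling applies depends on the compiler (assumption: the
    // GCC form exists only in releases much later than gcc 4.1).
#if defined(__INTEL_COMPILER)
#pragma ivdep
#elif defined(__GNUC__) && !defined(__clang__)
#pragma GCC ivdep
#endif
    for (std::size_t i = 0; i < n; ++i)
        y[i] += a * x[i];   // single basic block, unit stride
}
```

Loops generated through expression templates usually do not look like this to the optimizer, which matches the observation above that the pragma alone made no difference.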

> When I tried it, some rejections emerged, and I have no other
> information since then. But I don't even know what to anticipate from
> it: would it be able to vectorize operations on arrays of Vector<,>s?
> Is the problem only aliasing, or are there other obstacles? And would
> it be worthwhile for operations with explicit L-R dependence like '+='?

Well, using a stencil operator, you can of course (try to) vectorize
some stuff manually using processor-specific intrinsics - but
this pays off only if you have very few such operators.  It cannot
be applied in general to the Vector<> stuff, as the template metaprograms
used to expand the expressions are not suited to this.

I did some work on optimizing all the Vector operations, but it
turned out we have massive problems with the PETE way of returning
references, so we end up returning references to temporary objects
and segfault.
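
The general hazard can be sketched outside of PETE. The toy expression node below is hypothetical (`Vec3`, `Sum`, `add` and `eval` are made-up names, not PETE or POOMA types) and shows the safe variant: operands are stored by value, so a nested temporary is copied into the node instead of being referenced after it has been destroyed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 3-component vector, standing in for Vector<3,double>.
struct Vec3 {
    double v[3];
    double operator[](std::size_t i) const { return v[i]; }
};

// Expression node holding its operands BY VALUE.
// If lhs/rhs were declared as `const L&` / `const R&`, then in
//   auto e = add(add(a, b), c);
// the inner Sum temporary would die at the end of the statement and
// e.lhs would dangle, so a later eval(e) would read freed stack memory.
template <class L, class R>
struct Sum {
    L lhs;
    R rhs;
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
};

// Builds a node; the result is returned by value, never by reference.
template <class L, class R>
Sum<L, R> add(const L& l, const R& r)
{
    return Sum<L, R>{l, r};
}

// Evaluates any expression node component by component.
template <class E>
Vec3 eval(const E& e)
{
    assert(sizeof(e) > 0);
    return Vec3{{e[0], e[1], e[2]}};
}
```

Copying operands costs something for large leaves, which is why expression-template libraries are tempted to hold references; the sketch only shows why that shortcut is fragile for intermediate nodes.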

> By the way, about the shortcoming I encountered earlier: as you
> mentioned, without bounds checking on the indices cellContaining()
> works, but SEGFAULTs. In addition: if position attributes are outside
> the domain but position[i]-origin (Vector<,> type) has all positive
> components, then there's no SEGFAULT but strange behaviour occurs; it
> segfaults if one of the position[i]-origin coordinates is <0.

It could also segfault, but it's always a matter of whether there is
_some_ memory to read at the bogus location.

> So there should always be some BC whose boundaries coincide with the
> mesh (as in PIC2d). And sync(), or something else that invokes the BCs,
> should come before gather or scatter.

Yes, probably.  Particles outside of the domain are evil.
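
A bounds-checked lookup makes the failure explicit instead of depending on what happens to live at the bogus address. The following one-dimensional sketch assumes a uniform mesh; `checkedCell` and `kOutside` are hypothetical names, and this is not how POOMA's cellContaining() is implemented.

```cpp
#include <cassert>
#include <cmath>

// Sentinel returned for positions that are not inside any cell.
const int kOutside = -1;

// Hypothetical helper for a uniform 1-D mesh with `ncells` cells of
// width `spacing` starting at `origin`. Returns the cell index, or
// kOutside if the position lies outside the mesh.
int checkedCell(double position, double origin, double spacing, int ncells)
{
    assert(spacing > 0.0 && ncells > 0);

    // floor() rather than a plain cast: casting truncates toward zero,
    // so a slightly negative offset would silently map to cell 0.
    const double cell = std::floor((position - origin) / spacing);

    // Written so that NaN also fails the test.
    if (!(cell >= 0.0) || cell >= static_cast<double>(ncells))
        return kOutside;

    return static_cast<int>(cell);
}
```

A gather or scatter loop could skip, or report, any particle for which this returns `kOutside`, although applying the boundary conditions beforehand remains the cleaner fix.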

Richard.

--
Richard Guenther <richard dot guenther at uni-tuebingen dot de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/
