octave-maintainers

Re: fftw and memory alignment


From: David Bateman
Subject: Re: fftw and memory alignment
Date: Tue, 17 Feb 2004 17:03:10 +0100
User-agent: Mutt/1.4.1i

According to Paul Kienzle <address@hidden> (on 02/17/04):
> David,
> 
> Regarding aligned memory, I did some searching in Blitz++.
> 
> You can see the code on SourceForge:
> 
> http://cvs.sourceforge.net/viewcvs.py/blitz/blitz/blitz/memblock.h?rev=HEAD
> http://cvs.sourceforge.net/viewcvs.py/blitz/blitz/blitz/memblock.cc?rev=HEAD
> 
> The guts of the allocator are in memblock.cc.  
> 
> The idea is what you would expect: allocate a hunk one 
> block size bigger than you need, then use the mod function 
> to find an offset within that block.  You need to keep track 
> of the original pointer so you can free it later.  
> 
> For pointer arithmetic, they are using ptrdiff_t from stddef.h:
> 
>   const int cacheBlockSize = 128;    // Will work for 32, 16 also
>   dataBlockAddress_ = reinterpret_cast<T_type*>
>       (new char[numBytes + cacheBlockSize - 1]);
> 
>   // Shift to the next cache line boundary
>   ptrdiff_t offset = ptrdiff_t(dataBlockAddress_) % cacheBlockSize;
>   int shift = (offset == 0) ? 0 : (cacheBlockSize - offset);
>   data_ = (T_type*)(((char *)dataBlockAddress_) + shift);
> 
> They are only aligning large blocks, not small ones.  I don't
> know what effect that will have on fftw.  If it is unaligned
> does the algorithm collapse?  Or is it just slower?  Or does
> it require more code and wisdom to support?
> 
> There are complications with constructors/destructors. If 
> they exist for the type you are allocating, then you need 
> to explicitly run the constructor/destructor for each element.
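
A minimal sketch of the scheme described above, with the explicit
constructor/destructor calls added. The names AlignedArray, aligned_new
and aligned_delete are hypothetical and come from neither Blitz++ nor
Octave; the sketch assumes the requested alignment is at least as strict
as the type's own alignment, and it does not handle a constructor that
throws part-way through.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical helper type, for illustration only.
template <typename T>
struct AlignedArray {
  char* raw;      // pointer returned by new[]; needed for delete[] later
  T* data;        // aligned pointer handed to the caller
  std::size_t n;  // number of constructed elements
};

template <typename T>
AlignedArray<T> aligned_new(std::size_t n, std::size_t alignment = 16) {
  assert(alignment != 0);
  AlignedArray<T> a;
  // Over-allocate so an aligned address is guaranteed to lie inside the block.
  a.raw = new char[n * sizeof(T) + alignment - 1];

  // Shift forward to the next multiple of `alignment`.
  std::size_t offset = reinterpret_cast<std::uintptr_t>(a.raw) % alignment;
  std::size_t shift = (offset == 0) ? 0 : alignment - offset;
  a.data = reinterpret_cast<T*>(a.raw + shift);
  a.n = n;

  // Raw char storage holds no objects yet: construct each element in place.
  for (std::size_t i = 0; i < n; ++i)
    new (a.data + i) T();
  return a;
}

template <typename T>
void aligned_delete(AlignedArray<T>& a) {
  // Destroy elements in reverse order, then free the original pointer.
  for (std::size_t i = a.n; i > 0; --i)
    a.data[i - 1].~T();
  delete[] a.raw;
  a.raw = nullptr;
  a.data = nullptr;
  a.n = 0;
}
```

Typical use would be something like aligned_new<double>(len) followed
later by aligned_delete on the same object. Keeping raw and data together
in one struct is one way of not losing the original pointer.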

There are two problems with unknown alignments.

1) The transform is slower if the data isn't 16-byte aligned, since SIMD
   instructions can't be used.
2) The planning code has to track the alignment.
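
As a rough illustration of the second point, the sketch below records
whether the input and output pointers are 16-byte aligned alongside the
transform length, so that a cached plan is only reused for buffers with
the same alignment. PlanKey, make_key and same_plan are hypothetical
names and not part of fftw or Octave; this is only one way the tracking
might be done.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True if p lies on an `alignment`-byte boundary (16 bytes for SIMD here).
inline bool is_aligned(const void* p, std::size_t alignment = 16) {
  assert(alignment != 0);
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical cache key: transform length plus alignment of both buffers.
struct PlanKey {
  std::size_t n;
  bool in_aligned;
  bool out_aligned;
};

inline PlanKey make_key(std::size_t n, const void* in, const void* out) {
  PlanKey k = { n, is_aligned(in), is_aligned(out) };
  return k;
}

// A cached plan would only be reused when every field of the key matches.
inline bool same_plan(const PlanKey& a, const PlanKey& b) {
  return a.n == b.n && a.in_aligned == b.in_aligned &&
         a.out_aligned == b.out_aligned;
}
```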

This idea is pretty much what I thought was worth doing in Array.h, but
it might be quite complex, since there are places where the pointer
itself is copied rather than the data. Could try it though...

Cheers
D.

-- 
David Bateman                                address@hidden
Motorola CRM                                 +33 1 69 35 48 04 (Ph) 
Parc Les Algorithmes, Commune de St Aubin    +33 1 69 35 77 01 (Fax) 
91193 Gif-Sur-Yvette FRANCE
