[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: conv2 performance
From: |
John Swensen |
Subject: |
Re: conv2 performance |
Date: |
Mon, 1 Mar 2010 09:17:38 -0500 |
On Feb 28, 2010, at 10:53 AM, John W. Eaton wrote:
>
>
> Maybe there is still room for improvement here. I would happily use a
> free software library with a GPL-compatible license to implement this
> function, but I don't know whether one is available.
>
> jwe
>
I have recently been doing a lot of 2D convolutions. I think the fastest
method should not involve loops of any kind. As convolution in the time domain
(or spatial domain when considering images) is multiplication in the frequency
domain, the fastest method is to take the FFT of both image and kernel,
dot-multiply them, then take the inverse FFT. Since the FFT method is usually
provided by FFTW, this should be optimized and quite fast. Of course, there
has to be some padding that takes place to make sure both 'images' are the same
size. I was using Matlab for this computation and the speed improvement of the
FFT method over the Matlab-provided conv2 was considerable (100 seconds versus
2 second; I was convolving a 2048x2048 image with a 256x256 kernel).
I think the method is formally called the overlap-add method
(http://en.wikipedia.org/wiki/Overlap-add_method). I used a script from
MatlabCentral (no flaming please, as I already saw the discussion that has been
going on for a week or two). This is the method used for many GPGPU
implementations. There is an in-depth description of the best way to do the
padding in an NVidia white paper that can be found at
http://developer.download.nvidia.com/compute/cuda/sdk/website/projects/convolutionFFT2D/doc/convolutionFFT2D.pdf
John
- Re: conv2 performance, (continued)
- Re: conv2 performance, Michael D. Godfrey, 2010/03/03
- Re: conv2 performance, Jaroslav Hajek, 2010/03/03
- Re: conv2 performance, John W. Eaton, 2010/03/03
- Re: conv2 performance, Michael D. Godfrey, 2010/03/03
- Re: conv2 performance, Jaroslav Hajek, 2010/03/04
- Re: conv2 performance, Michael Goffioul, 2010/03/04
- Re: conv2 performance, John W. Eaton, 2010/03/04
- Re: conv2 performance, Jaroslav Hajek, 2010/03/04
- Re: conv2 performance, John W. Eaton, 2010/03/04
Re: conv2 performance, Søren Hauberg, 2010/03/01
Re: conv2 performance,
John Swensen <=
conv2 performance, Lukas Reichlin, 2010/03/03