
From:  Tomasz Ożański 
Subject:  Re: SoCiS: maximum entropy method reconstruction for Octave 
Date:  Mon, 30 Mar 2015 14:29:49 +0200 
Useragent:  Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 
On 30.03.2015 at 12:06, Joao F. D. Rodrigues wrote:
> Dear Tomasz
>
> I read this thread with great interest, as I also work with entropy but in a different area. I read through the paper you mentioned and then even this page: http://cmm.cit.nih.gov/maxent/decon4.html but I couldn't really understand how the probability distribution/constraints/observations map onto the image reconstruction problem. I wonder if you could help me out here (by the way, this would probably fit well as an example in the function documentation):
>
> Imagine that my observation is a 100x100 pixel grayscale picture, e.g., each pixel may take values in (0..10). Hence, the observation is a matrix of size 100x100 whose entries have values between 0 and 10. Can you give similar examples/interpretations of what the constraints/error/prob. dist. in the maxent problem are?
>
> Thank you
> Joao
Well, I'm not any kind of expert here. I'm just willing to implement the algorithm, and I *hope* it will also become useful in some other kinds of problems. However, I already implemented a very simple MEM algorithm a few years ago, so I might be able to provide some explanation here.
First, the image you are speaking about should be regarded as the "data" or "observation". What you probably want is to get some estimate of the "input image". Note that the input image has the properties of a probability distribution: if you normalize it so that the sum of all pixels is 1, it essentially is a 2D pmf for the position from which the next photon will come.
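To make the "image as a pmf" reading concrete, here is a minimal sketch in Python/NumPy (not Octave code from the project, just an illustration with a made-up random image):

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical 100x100 grayscale image with pixel values in (0..10),
# as in Joao's example.
img = rng.uniform(0.0, 10.0, size=(100, 100))

# Normalizing so that all pixels sum to 1 turns the image into a 2D pmf:
# p[i, j] is the probability that the next photon arrives at pixel (i, j).
p = img / img.sum()

# The entropy of that distribution, which MEM maximizes.
entropy = -np.sum(p * np.log(p))
```

The entropy is bounded above by log(100*100), attained only by a perfectly flat image.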
You also know the transformation that was applied to the input image to produce the observation (in the example on the page you linked, it is a 2D convolution with some point spread function). The observation is also distorted by some noise that was added to it.
We can see that the information was lost here (the convolution removes higher frequencies from the image and the noise gives some uncertainty for the pixel values) so there are many potential input images which could produce what is observed.
For this and many other problems the transformation is linear and can be written in matrix form:

y = T x,

where "x" and "y" are the input image and the observed image respectively (they should be column vectors here, which can be obtained e.g. by stacking the columns of the images). "T" is the transformation matrix; in our case it is a convolution matrix adapted to the "vectorized" shape of the images, but for any linear problem such a matrix always exists. Note that the inverse of "T" does not have to exist ("T" is often non-square or badly conditioned).
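A small sketch of how the convolution becomes a matrix acting on column-stacked images (again just an illustration in Python/NumPy with a tiny made-up 5x5 image and PSF; the matrix is built column by column from unit impulses):

```python
import numpy as np
from scipy.ndimage import convolve

n = 5                                  # small hypothetical image side length
psf = np.array([[0., 1., 0.],
                [1., 4., 1.],
                [0., 1., 0.]])
psf /= psf.sum()                       # a toy point spread function

# Column k of T is the blurred image of the k-th unit-impulse image,
# column-stacked into a vector (order='F' stacks columns).
T = np.zeros((n * n, n * n))
for k in range(n * n):
    e = np.zeros(n * n)
    e[k] = 1.0
    T[:, k] = convolve(e.reshape(n, n, order='F'), psf,
                       mode='constant').ravel(order='F')

# By linearity, T @ vec(x) equals vec(conv(x, psf)) for any image x.
x = np.random.default_rng(1).uniform(0.0, 10.0, n * n)
y_matrix = T @ x
y_direct = convolve(x.reshape(n, n, order='F'), psf,
                    mode='constant').ravel(order='F')
```

Here `y_matrix` and `y_direct` agree, which is exactly the statement y = T x.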
Since we want to compute "x", this is basically an inverse problem, but because of "T"'s properties it is impossible to get a meaningful answer by direct inversion or least-squares methods (with least squares you usually get an overfit). So the idea is to somehow estimate the set of all plausible input images and choose from it the one with the highest entropy (you have to treat the image as a probability distribution here).
This is how it is done in the example: since you know the noise (it is Gaussian with some given standard deviation), the total error is given by a chi-square statistic. In the paper they assume that all the images for which the chi-square statistic lies within some alpha-confidence interval are plausible. The algorithm's job is to select the one with the highest entropy out of them. So basically it has to maximize the entropy under the constraint that the chi-square statistic lies within the confidence interval. In the article they show that for a convex error function (which the chi-square of a linear transformation is) there is only one solution, and it lies on the boundary of the confidence region (unless the unconstrained maximum-entropy image is already plausible).
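The constrained optimization can be sketched generically. This is *not* the algorithm from the paper, just a toy 1-D problem handed to SciPy's general-purpose SLSQP solver, with a made-up blur matrix and noise level, to show the "maximize entropy subject to chi-square within the confidence bound" structure:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2
from scipy.linalg import toeplitz

rng = np.random.default_rng(0)
n = 8                                    # tiny 1-D "image" for brevity
T = toeplitz([1.0, 0.25] + [0.0] * (n - 2))   # toy tridiagonal blur
x_true = 1.0 + rng.uniform(0.0, 1.0, n)       # hypothetical input image
sigma = 0.1                                   # known Gaussian noise level
y = T @ x_true + rng.normal(0.0, sigma, n)    # noisy observation

def chisq(x):
    r = (T @ x - y) / sigma
    return r @ r

# Plausible set: chi-square within the (here 95%) confidence bound.
limit = chi2.ppf(0.95, df=n)

def neg_entropy(x):
    p = x / x.sum()                      # treat the image as a pmf
    return np.sum(p * np.log(p))

# Start from the least-squares solution (fits well, but overfitted)
# and let the optimizer trade fit for entropy up to the bound.
x0 = np.clip(np.linalg.solve(T, y), 1e-6, None)
res = minimize(neg_entropy, x0, method='SLSQP',
               bounds=[(1e-9, None)] * n,
               constraints=[{'type': 'ineq',
                             'fun': lambda x: limit - chisq(x)}])
x_mem = res.x
```

In line with the paper's observation, the solver typically stops on the boundary of the confidence region: moving further toward the flat (maximum-entropy) image would push the chi-square past the bound.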
The maximum-entropy estimate also has the nice property that it doesn't show any features apart from those indicated by the data.
I hope this clears things up a bit.

greetings,
Tomasz Ożański