gnuastro-commits
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[gnuastro-commits] master 78a1f3c: Book: Edited tutotrials regarding Noi


From: Mohammad Akhlaghi
Subject: [gnuastro-commits] master 78a1f3c: Book: Edited tutotrials regarding NoiseChisel usage
Date: Sat, 5 Dec 2020 23:35:10 -0500 (EST)

branch: master
commit 78a1f3c44d20867e47333f5ccd2f8c8eb5ace274
Author: Mohammad Akhlaghi <mohammad@akhlaghi.org>
Commit: Mohammad Akhlaghi <mohammad@akhlaghi.org>

    Book: Edited tutotrials regarding NoiseChisel usage
    
    With the recent updates in NoiseChisel, it was necessary to edit/improve
    the guidelines of optimal NoiseChisel usage in the tutorials. While doing
    this, I also made the following improvements/corrections:
    
     - At the start of the third tutorial, I also added a discussion on dataset
       verification (which is good practice for similar research scenarios.).
    
     - I noticed that the surface brightness level reported in the third
       tutorial was mistakenly being calculated by multiplying the brightness
       by area (instead of dividing them!). Generally, MakeCatalog now has a
       surface brightness column that people can use, so that is used
       now. Because of that typo before, the correct outer surface brightness
       level is actually 25.69 (instead of the previously reported value of
       ~28).
---
 doc/gnuastro.texi | 316 +++++++++++++++++++++++++++++++++---------------------
 1 file changed, 191 insertions(+), 125 deletions(-)

diff --git a/doc/gnuastro.texi b/doc/gnuastro.texi
index 9bae32b..ce17464 100644
--- a/doc/gnuastro.texi
+++ b/doc/gnuastro.texi
@@ -267,6 +267,12 @@ General program usage tutorial
 
 Detecting large extended targets
 
+* Downloading and validating input data::  How to get and check the input data.
+* NoiseChisel optimization::    Detect the extended and diffuse wings.
+* Achieved surface brightness level::  Calculate the outer surface brightness.
+
+Downloading and validating input data
+
 * NoiseChisel optimization::    Optimize NoiseChisel to dig very deep.
 * Achieved surface brightness level::  Measure how much you detected.
 
@@ -2749,20 +2755,32 @@ To invert the result (only keep the detected pixels), 
you can flip the detection
 $ astarithmetic $in $det not nan where --output=mask-sky.fits
 @end example
 
-Looking again at the detected pixels, we see that there are thin connections 
between many of the smaller objects or extending from larger objects.
-This shows that we have dug in too deep, and that we are following correlated 
noise.
+@cindex Correlated noise
+@cindex Noise, correlated
+Look again at the @code{DETECTIONS} extension, in particular the long 
worm-like structure that has branched out of the galaxy around @footnote{To 
find a particular coordiante easily in DS9, you can do this: Click on the 
``Edit'' menu, and select ``Region''.
+Then click on any random part of the image to see a circle show up in that 
location (this is the ``region'').
+Double-click on the region and a ``Circle'' window will open.
+If you have celestial coordinates, keep the default ``fk5'' in the scroll-down 
menu after the ``Center''.
+But if you have pixel/image coordinates, click on the ``fk5'' and select 
``Image''.
+Now you can set the ``Center'' coordinates of the region (@code{1450} and 
@code{1680} in this case)) by manually typing them in the two boxes infront of 
``Center''.
+Finally, when everything is ready, click on the ``Apply'' button and your 
region will go over your requested coordinates.
+You can zoom out (to see the whole image) and visually find it.} pixel 1450 
(X) and 1680 (Y).
+This these types of long wiggly structures show that we have dug too deep into 
the noise, and that we are following correlated noise.
+Correlated noise is created when we warp (for example rotate) individual 
exposures (that are each slightly offset compared to each other) into the same 
pixel grid before adding them into one deeper image.
+During the warping, nearby pixels are mixed and the effect of this mixing on 
the noise (which is in every pixel) is called ``correlated noise''.
+Correlated noise is a form of convolution and it slightly smooths the image.
 
-Correlated noise is created when we warp datasets from individual exposures 
(that are each slightly offset compared to each other) into the same pixel 
grid, then add them to form the final result.
-Because it mixes nearby pixel values, correlated noise is a form of 
convolution and it smooths the image.
 In terms of the number of exposures (and thus correlated noise), the XDF 
dataset is by no means an ordinary dataset.
 It is the result of warping and adding roughly 80 separate exposures which can 
create strong correlated noise/smoothing.
 In common surveys the number of exposures is usually 10 or less.
+See Figure 2 of @url{https://arxiv.org/abs/1909.11230, Akhlaghi [2019]} and 
the discussion on @option{--detgrowquant} there for more on how NoiseChisel 
``grow''s the detected objects and the patterns caused by correlated noise.
 
 Let's tweak NoiseChisel's configuration a little to get a better result on 
this dataset.
 Don't forget that ``@emph{Good statistical analysis is not a purely routine 
matter, and generally calls for more than one pass through the computer}'' 
(Anscombe 1973, see @ref{Science and its tools}).
 A good scientist must have a good understanding of her tools to make a 
meaningful analysis.
-So don't hesitate in playing with the default configuration and reviewing the 
manual when you have a new dataset in front of you.
+So don't hesitate in playing with the default configuration and reviewing the 
manual when you have a new dataset (from a new instrument) in front of you.
 Robust data analysis is an art, therefore a good scientist must first be a 
good artist.
+Once you have found the good configuration for that particular noise pattern 
(instrument) you can safely use it for all new data that have a similar noise 
pattern.
 
 NoiseChisel can produce ``Check images'' to help you visualize and inspect how 
each step is done.
 You can see all the check images it can produce with this command.
@@ -2793,10 +2811,12 @@ In order to understand the parameters and their biases 
(especially as you are st
 
 @cindex FWHM
 The @code{OPENED_AND_LABELED} extension shows the initial detection step of 
NoiseChisel.
-We see these thin connections between smaller points are already present here 
(a relatively early stage in the processing).
-Such connections at the lowest surface brightness limits usually occur when 
the dataset is too smoothed.
-Because of correlated noise, the dataset is already artificially smoothed, 
therefore further smoothing it with the default kernel may be the problem.
-One solution is thus to use a sharper kernel (NoiseChisel's first step in its 
processing).
+We see those thin/false connections due to correlated noise between smaller 
points are already present here (a relatively early stage in the processing).
+Such connections at the lowest surface brightness limits usually occur when 
the dataset is too smoothed, the threshold is too low, or the final ``growth'' 
is too much.
+
+As you see from the 2nd (@code{CONVOLVED}) extension, the first operation that 
NoiseChisel does on the data is to slightly smooth it.
+However, the natural correlated noise of this dataset is already one level of 
artificial smoothing, so further smoothing it with the default kernel may be 
the culprit.
+To see the effect, let's use a sharper kernel as a first step to 
convolve/smooth the input.
 
 By default NoiseChisel uses a Gaussian with full-width-half-maximum (FWHM) of 
2 pixels.
 We can use Gnuastro's MakeProfiles to build a kernel with FWHM of 1.5 pixel 
(truncated at 5 times the FWHM, like the default) using the following command.
@@ -2808,18 +2828,22 @@ $ astmkprof --kernel=gaussian,1.5,5 --oversample=1
 
 @noindent
 Please open the output @file{kernel.fits} and have a look (it is very small 
and sharp).
-We can now tell NoiseChisel to use this instead of the default kernel with the 
following command (we'll keep checking the detection steps)
+We can now tell NoiseChisel to use this instead of the default kernel with the 
following command (we'll keep the @option{--checkdetection} to continue 
checking the detection steps)
 
 @example
 $ astnoisechisel flat-ir/xdf-f160w.fits --kernel=kernel.fits  \
                  --checkdetection
 @end example
 
-Looking at the @code{OPENED_AND_LABELED} extension, we see that the thin 
connections between smaller peaks has now significantly decreased.
+Open the output @file{xdf-f160w_detcheck.fits} as a multi-extension FITS file 
and go to the last extension (@code{DETECTIONS-FINAL}, it is the same pixels as 
the final NoiseChisel output without @option{--checkdetections}).
+Look again at that position mentioned above (1450,1680), you see that the long 
wiggly leg has disappeared.
+This shows we are making progress.
+
+Looking at the new @code{OPENED_AND_LABELED} extension, we see that the thin 
connections between smaller peaks has now significantly decreased.
 Going two extensions/steps ahead (in the first @code{HOLES-FILLED}), you can 
see that during the process of finding false pseudo-detections, too many holes 
have been filled: do you see how the many of the brighter galaxies are 
connected? At this stage all holes are filled, irrespective of their size.
 
 Try looking two extensions ahead (in the first @code{PSEUDOS-FOR-SN}), you can 
see that there aren't too many pseudo-detections because of all those extended 
filled holes.
-If you look closely, you can see the number of pseudo-detections in the result 
NoiseChisel prints (around 5000).
+If you look closely, you can see the number of pseudo-detections in the result 
NoiseChisel prints (around 6100).
 This is another side-effect of correlated noise.
 To address it, we should slightly increase the pseudo-detection threshold 
(before changing @option{--dthresh}, run with @option{-P} to see the default 
value):
 
@@ -2828,12 +2852,15 @@ $ astnoisechisel flat-ir/xdf-f160w.fits 
--kernel=kernel.fits \
                  --dthresh=0.1 --checkdetection
 @end example
 
-Before visually inspecting the check image, you can already see the effect of 
this change in NoiseChisel's command-line output: notice how the number of 
pseudos has increased to more than 6000.
-Open the check image now and have a look, you can see how the 
pseudo-detections are distributed much more evenly in the image.
+Before visually inspecting the check image, you can already see the effect of 
this change in NoiseChisel's command-line output: notice how the number of 
pseudos has increased to more than 6800.
+Open the check image now and have a look, you can see how the 
pseudo-detections are distributed much more evenly in the blank sky regions of 
the @code{PSEUDOS-FOR-SN} extension.
 
 @cartouche
 @noindent
 @strong{Maximize the number of pseudo-detections:} For a new noise-pattern 
(different instrument), play with @code{--dthresh} until you get a maximal 
number of pseudo-detections (the total number of pseudo-detections is printed 
on the command-line when you run NoiseChisel).
+
+In this particular case, try @option{--dthresh=0.2} and you will see that the 
total printed number decreases to around 6400 (recall that with 
@option{--dthresh=0.1}, it was roughly 6100).
+So a @option{--dthresh=0.1} (which gives roughly 6800 is the best value).
 @end cartouche
 
 The signal-to-noise ratio of pseudo-detections define NoiseChisel's reference 
for removing false detections, so they are very important to get right.
@@ -2844,16 +2871,25 @@ $ astnoisechisel flat-ir/xdf-f160w.fits 
--kernel=kernel.fits  \
                  --dthresh=0.1 --checkdetection --checksn
 @end example
 
-The output (@file{xdf-f160w_detsn.fits}) contains two extensions for the 
pseudo-detections over the undetected (sky) regions and those over detections.
-The first column is the pseudo-detection label which you can see in the 
respective@footnote{The first @code{PSEUDOS-FOR-SN} in 
@file{xdf-f160w_detsn.fits} is for the pseudo-detections over the undetected 
regions and the second is for those over detected regions.} 
@code{PSEUDOS-FOR-SN} extension of @file{xdf-f160w_detcheck.fits}.
-You can see the table columns with the first command below and get a feeling 
for its distribution with the second command (the two Table and Statistics 
programs will be discussed later in the tutorial)
+The output (@file{xdf-f160w_detsn.fits}) contains two extensions for the 
pseudo-detections over the undetected (@code{SKY_PSEUDODET_SN}) regions and 
those over detections (@code{DET_PSEUDODET_SN}):
+
+@example
+$ astfits xdf-f160w_detsn.fits
+$ asttable xdf-f160w_detsn.fits -i
+@end example
+
+@noindent
+As you see from the output of the commands above, both tables contain two 
columns.
+The first column is the pseudo-detection label which you can see in the 
respective @code{PSEUDOS-FOR-SN} extension of the check image 
(@file{xdf-f160w_detcheck.fits}).
+You can see the table columns with the first command below and get a feeling 
of the signal-to-noise value distribution with the second command (the two 
Table and Statistics programs will be discussed later in the tutorial)
 
 @example
 $ asttable xdf-f160w_detsn.fits -hSKY_PSEUDODET_SN
 $ aststatistics xdf-f160w_detsn.fits -hSKY_PSEUDODET_SN -c2
 @end example
 
-The correlated noise is again visible in this pseudo-detection signal-to-noise 
distribution: it is highly skewed.
+The correlated noise is again visible in the signal-to-noise distribution of 
sky pseudo-detections: it is highly skewed.
+In an image with less correlated noise, this distribution would be much more 
symmetric.
 A small change in the quantile will translate into a big change in the S/N 
value.
 For example see the difference between the three 0.99, 0.95 and 0.90 quantiles 
with this command:
 
@@ -2876,6 +2912,7 @@ $ astarithmetic $in $det nan where --output=mask-det.fits
 @end example
 
 Overall it seems good, but if you play a little with the color-bar and look 
closer in the noise, you'll see a few very sharp, but faint, objects that have 
not been detected.
+For example the sharp peak around pixel (2084, 1434).
 This only happens for under-sampled datasets like HST (where the pixel size is 
larger than the point spread function FWHM).
 So this won't happen on ground-based images.
 Because of this, sharp and faint objects will be very small and eroded too 
easily during NoiseChisel's erosion step.
@@ -2890,8 +2927,12 @@ $ astnoisechisel flat-ir/xdf-f160w.fits 
--kernel=kernel.fits     \
                  --noerodequant=0.95 --dthresh=0.1 --snquant=0.95
 @end example
 
-This seems to be fine and we can continue with our analysis.
-To avoid having to write these options on every call to NoiseChisel, we'll 
just make a configuration file in a visible @file{config} directory.
+This seems to be fine and we'll stop the configuration here, but please feel 
free to keep looking into the data to see if you can improve it even more.
+
+Once you have found the proper customization for the type of images you will 
be using you don't need to change them any more.
+The same configuration can be used for any dataset that has been similarly 
produced (and has a similar noise pattern).
+But entering all these options on every call to NoiseChisel is annoying and 
prone to bugs (mistakenly typing the wrong value for example).
+To simply things, we'll make a configuration file in a visible @file{config} 
directory.
 Then we'll define the hidden @file{.gnuastro} directory (that all Gnuastro's 
programs will look into for configuration files) as a symbolic link to the 
@file{config} directory.
 Finally, we'll write the finalized values of the options into NoiseChisel's 
standard configuration file within that directory.
 We'll also put the kernel in a separate directory to keep the top directory 
clean of any files we later need.
@@ -2907,8 +2948,7 @@ $ echo "snquant      0.95"             >> 
config/astnoisechisel.conf
 @end example
 
 @noindent
-We are now ready to finally run NoiseChisel on the two filters and keep the
-output in a dedicated directory (@file{nc}).
+We are now ready to finally run NoiseChisel on the three filters and keep the 
output in a dedicated directory (which we'll call @file{nc} for simplicity).
 @example
 $ rm *.fits
 $ mkdir nc
@@ -3908,6 +3948,17 @@ Basic features like access to this book on the 
command-line, the configuration f
 We'll try to detect the faint tidal wings of the beautiful M51 
group@footnote{@url{https://en.wikipedia.org/wiki/M51_Group}} in this tutorial.
 We'll use a dataset/image from the public @url{http://www.sdss.org/, Sloan 
Digital Sky Survey}, or SDSS.
 Due to its more peculiar low surface brightness structure/features, we'll 
focus on the dwarf companion galaxy of the group (or NGC 5195).
+
+
+@menu
+* Downloading and validating input data::  How to get and check the input data.
+* NoiseChisel optimization::    Detect the extended and diffuse wings.
+* Achieved surface brightness level::  Calculate the outer surface brightness.
+@end menu
+
+@node Downloading and validating input data, NoiseChisel optimization, 
Detecting large extended targets, Detecting large extended targets
+@subsection Downloading and validating input data
+
 To get the image, you can use SDSS's @url{https://dr12.sdss.org/fields, Simple 
field search} tool.
 As long as it is covered by the SDSS, you can find an image containing your 
desired target either by providing a standard name (if it has one), or its 
coordinates.
 To access the dataset we will use here, write @code{NGC5195} in the ``Object 
Name'' field and press ``Submit'' button.
@@ -3935,10 +3986,50 @@ $ 
topurl=https://dr12.sdss.org/sas/dr12/boss/photoObj/frames
 $ wget $topurl/301/3716/6/frame-r-003716-6-0117.fits.bz2 -Or.fits.bz2
 @end example
 
+When you want to reproduce a previous result (a known analysis, on a known 
dataset, to get a known result: like the case here!) it is important to verify 
that the file is correct: input file hasn't changed (on the remote server, or 
in your own archive), or there was no downloading problem.
+Otherwise, if the data have changed in your server/archive, and you use the 
same script, you will get a different result, causing a lot of confusion!
+
+@cindex Checksum
+@cindex SHA-1 checksum
+@cindex Verification, checksum
+One good way to verify the contents of a file is to store its @emph{Checksum} 
in your analysis script and check it before any other operation.
+The @emph{Checksum} algorithms look into the contents of a file and calculate 
a fixed-length string from them.
+If any change (even in a bit or byte) is made within the file, the resulting 
string will change, for more see @url{https://en.wikipedia.org/wiki/Checksum, 
Wikipedia}.
+There are many common algorithms, but a simple one is the 
@url{https://en.wikipedia.org/wiki/SHA-1, SHA-1 algorithm} (Secure Hash 
Algorithm 1) that you can calculate easily with the command below (the second 
line is the output, and the checksum is the first/long string: it is 
independent of the file name)
+
+@example
+$ sha1sum r.fits.bz2
+5fb06a572c6107c72cbc5eb8a9329f536c7e7f65  r.fits.bz2
+@end example
+
+If the output on your computer is different from this, either the file has 
been incorrectly downloaded (most probable), or it has changed on SDSS servers 
(very unlikely@footnote{If your checksum is different, try uncompressing the 
file with the @command{bunzip2} command after this, and open the resulting FITS 
file.
+If it opens and you see the image of M51, then there was no download problem, 
and the file has indeed changed.
+In this case, please contact us at @code{bug-gnuastro@@gnu.org}.}).
+To get a better feeling of checksums open your favorite text editor and make a 
test file by writing something in it.
+Save it and calculate the new file's SHA-1 checksum with @command{sha1sum}.
+Try renaming that file, and you'll see the checksum hasn't changed (checksums 
only look into the contents, not the name/location of the file).
+Then open the file with your text editor again, make a change and re-calculate 
its checksum, you'll see the checksum string has changed.
+
+Its always good to keep this short checksum string with your project's scripts 
and validate your input data before using them.
+You can do this with a shell conditional like this:
+
+@example
+filename=r.fits.bz2
+expected=5fb06a572c6107c72cbc5eb8a9329f536c7e7f65
+sum=$(sha1sum $filename | awk '@{print $1@}')
+if [ $sum = $expected ]; then
+  echo "$filename: validated"
+else
+  echo "$filename: wrong checksum!"
+  exit 1
+fi
+@end example
+
 @cindex Bzip2
 @noindent
-This server keeps the files in a Bzip2 compressed file format.
-So we'll first decompress it with the following command.
+Now that we know you have the same data that we wrote this tutorial with, 
let's continue.
+The SDSS server keeps the files in a Bzip2 compressed file format (that have a 
@code{.bz2} suffix).
+So we'll first decompress it with the following command to use it as a normal 
FITS file.
 By convention, compression programs delete the original file (compressed when 
uncompressing, or uncompressed when compressing).
 To keep the original file, you can use the @option{--keep} or @option{-k} 
option which is available in most compression programs for this job.
 Here, we don't need the compressed file any more, so we'll just let 
@command{bunzip} delete it for us and keep the directory clean.
@@ -3953,7 +4044,7 @@ $ bunzip2 r.fits.bz2
 * Achieved surface brightness level::  Measure how much you detected.
 @end menu
 
-@node NoiseChisel optimization, Achieved surface brightness level, Detecting 
large extended targets, Detecting large extended targets
+@node NoiseChisel optimization, Achieved surface brightness level, Downloading 
and validating input data, Detecting large extended targets
 @subsection NoiseChisel optimization
 In @ref{Detecting large extended targets} we downloaded the single exposure 
SDSS image.
 Let's see how NoiseChisel operates on it with its default parameters:
@@ -3971,30 +4062,31 @@ $ ds9 -mecube r_detected.fits -zscale -zoom to fit
 @end example
 
 Flipping through the extensions in a FITS viewer, you will see that the first 
image (Sky-subtracted image) looks reasonable: there are no major artifacts due 
to bad Sky subtraction compared to the input.
-The second extension also seems reasonable with a large detection map that 
covers the whole of NGC5195, but also extends beyond towards the bottom of the 
image.
+The second extension also seems reasonable with a large detection map that 
covers the whole of NGC5195, but also extends towards the bottom of the image 
where we actually see faint and diffuse signal in the input image.
 
 Now try flipping between the @code{DETECTIONS} and @code{SKY} extensions.
 In the @code{SKY} extension, you'll notice that there is still significant 
signal beyond the detected pixels.
-You can tell that this signal belongs to the galaxy because the far-right side 
of the image is dark and the brighter parts are just surrounding the detected 
pixels.
+You can tell that this signal belongs to the galaxy because the far-right side 
of the image (away from M51) is dark (has lower values) and the brighter parts 
in the Sky image (with larger values) are just under the detections and follow 
a similar pattern.
 
-The fact that signal from the galaxy remains in the @code{SKY} HDU shows that 
you haven't done a good detection.
+The fact that signal from the galaxy remains in the @code{SKY} HDU shows that 
NoiseChisel can be optimized for a much better result.
 The @code{SKY} extension must not contain any light around the galaxy.
-Generally, any time your target is much larger than the tile size and the 
signal is almost flat (like this case), this @emph{will} happen.
+Generally, any time your target is much larger than the tile size and the 
signal is very diffuse and extended at low signal-to-noise values (like this 
case), this @emph{will} happen.
 Therefore, when there are large objects in the dataset, @strong{the best 
place} to check the accuracy of your detection is the estimated Sky image.
 
 When dominated by the background, noise has a symmetric distribution.
 However, signal is not symmetric (we don't have negative signal).
-Therefore when non-constant signal is present in a noisy dataset, the 
distribution will be positively skewed.
-This skewness is a good measure of how much signal we have in the distribution.
-The skewness can be accurately measured by the difference in the mean and 
median: assuming no strong outliers, the more distant they are, the more skewed 
the dataset is.
+Therefore when non-constant@footnote{by constant, we mean that it has a single 
value in the region we are measuring.} signal is present in a noisy dataset, 
the distribution will be positively skewed.
+For a demonstration, see Figure 1 of @url{https://arxiv.org/abs/1505.01664, 
Akhlaghi and Ichikawa [2015]}.
+This skewness is a good measure of how much faint signal we have in the 
distribution.
+The skewness can be accurately measured by the difference in the mean and 
median (assuming no strong outliers): the more distant they are, the more 
skewed the dataset is.
 For more see @ref{Quantifying signal in a tile}.
 
 However, skewness is only a proxy for signal when the signal has structure 
(varies per pixel).
-Therefore, when it is approximately constant over a whole tile, or sub-set of 
the image, the signal's effect is just to shift the symmetric center of the 
noise distribution to the positive and there won't be any skewness (major 
difference between the mean and median).
+Therefore, when it is approximately constant over a whole tile, or sub-set of 
the image, the constant signal's effect is just to shift the symmetric center 
of the noise distribution to the positive and there won't be any skewness 
(major difference between the mean and median).
 This positive@footnote{In processed images, where the Sky value can be 
over-estimated, this constant shift can be negative.} shift that preserves the 
symmetric distribution is the Sky value.
 When there is a gradient over the dataset, different tiles will have different 
constant shifts/Sky-values, for example see Figure 11 of 
@url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa [2015]}.
 
-To get less scatter in measuring the mean and median (and thus better estimate 
the skewness), you will need a larger tile.
+To make this very large diffuse/flat signal detectable, you will therefore 
need a larger tile to contain a larger change in the values within it (and 
improve number statistics, for less scatter when measuring the mean and median).
 So let's play with the tessellation a little to see how it affects the result.
 In Gnuastro, you can see the option values (@option{--tilesize} in this case) 
by adding the @option{-P} option to your last command.
 Try running NoiseChisel with @option{-P} to see its default tile size.
@@ -4011,13 +4103,13 @@ Did you see how NoiseChisel aborted after finding and 
applying the quantile thre
 When you call any of NoiseChisel's @option{--check*} options, by default, it 
will abort as soon as all the check steps have been written in the check file 
(a multi-extension FITS file).
 This allows you to focus on the problem you wanted to check as soon as 
possible (you can disable this feature with the @option{--continueaftercheck} 
option).
 
-To optimize the threshold-related settings for this image, let's playing with 
this quantile threshold check image a little.
+To optimize the threshold-related settings for this image, let's play with 
this quantile threshold check image a little.
 Don't forget that ``@emph{Good statistical analysis is not a purely routine 
matter, and generally calls for more than one pass through the computer}'' 
(Anscombe 1973, see @ref{Science and its tools}).
 A good scientist must have a good understanding of her tools to make a 
meaningful analysis.
-So don't hesitate in playing with the default configuration and reviewing the 
manual when you have a new dataset in front of you.
+So don't hesitate in playing with the default configuration and reviewing the 
manual when you have a new dataset (from a new instrument) in front of you.
 Robust data analysis is an art, therefore a good scientist must first be a 
good artist.
 
-The first extension of @file{r_qthresh.fits} (@code{CONVOLVED}) is the 
convolved input image where the threshold(s) is(are) defined and applied.
+The first extension (called @code{CONVOLVED}) of @file{r_qthresh.fits} is the 
convolved input image where the threshold(s) is(are) defined (and later applied 
to).
 For more on the effect of convolution and thresholding, see Sections 3.1.1 and 
3.1.2 of @url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa [2015]}.
 The second extension (@code{QTHRESH_ERODE}) has a blank value for all the 
pixels of any tile that was identified as having significant signal.
 The next two extensions (@code{QTHRESH_NOERODE} and @code{QTHRESH_EXPAND}) are 
the other two quantile thresholds that are necessary in NoiseChisel's later 
steps.
@@ -4028,12 +4120,12 @@ As one line of attack against discarding too much 
signal below the threshold, No
 Go forward by three extensions to @code{VALUE1_NO_OUTLIER} and you will see 
that many of the tiles over the galaxy have been removed in this step.
 For more on the outlier rejection algorithm, see the latter half of 
@ref{Quantifying signal in a tile}.
 
-However, the default outlier rejection parameters weren't enough, and when you 
play with the color-bar, you still see a strong gradient around the outer tidal 
feature of the galaxy.
+However, the default outlier rejection parameters weren't enough, and when you 
play with the color-bar, you still see a gradient near the outer tidal feature 
of the galaxy.
 You have two strategies for fixing this problem: 1) Increase the tile size to 
get more accurate measurements of skewness.
 2) Strengthen the outlier rejection parameters to discard more of the tiles 
with signal.
 Fortunately in this image we have a sufficiently large region on the right of 
the image that the galaxy doesn't extend to.
 So we can use the more robust first solution.
-In situations where this doesn't happen (for example if the field of view in 
this image was shifted to have more of M51 and less sky) you are limited to a 
combination of the two solutions or just to the second solution.
+In situations where this doesn't happen (for example if the field of view in 
this image was shifted to the left to have more of M51 and less sky) you are 
limited to a combination of the two solutions or just to the second solution.
 
 @cartouche
 @noindent
@@ -4063,14 +4155,17 @@ Let's check the next series of detection steps by 
adding the @code{--checkdetect
 $ astnoisechisel r.fits -h0 --tilesize=100,100 --checkdetection
 @end example
 
+@cindex Erosion (image processing)
 The output now has 16 extensions, showing every step that is taken by 
NoiseChisel.
 The first and second (@code{INPUT} and @code{CONVOLVED}) are clear from their 
names.
 The third (@code{THRESHOLDED}) is the thresholded image after finding the 
quantile threshold (last extension of the output of @code{--checkqthresh}).
-The fourth HDU (@code{ERODED} is new: its the name-stake of NoiseChisel, or 
eroding pixels that are above the threshold.
+The fourth HDU (@code{ERODED}) is new: its the name-stake of NoiseChisel, or 
eroding pixels that are above the threshold.
 By erosion, we mean that all pixels with a value of @code{1} (above the 
threshold) that are touching a pixel with a value of @code{0} (below the 
threshold) will be flipped to zero (or ``carved'' out)@footnote{Pixels with a 
value of @code{2} are very high signal-to-noise pixels, they are not eroded, to 
preserve sharp and bright sources.}.
-You can see its effect directly by flipping between the @code{THRESHOLDED} and 
@code{ERODED} extensions.
+You can see its effect directly by going back and forth between the 
@code{THRESHOLDED} and @code{ERODED} extensions.
 
-In the fifth extension (@code{OPENED-AND-LABELED}) the image is ``opened'', 
which is a name for erodeding once, then dilating. This is good to remove thin 
connections that are only due to noise.
+@cindex Dilation (image processing)
+In the fifth extension (@code{OPENED-AND-LABELED}) the image is ``opened'', 
which is a name for eroding once, then dilating (dilation is the inverse of 
erosion).
+This is good to remove thin connections that are only due to noise.
 Each separate connected group of pixels is also given its unique label here.
 Do you see how just beyond the large M51 detection, there are many smaller 
detections that get smaller as you go more distant?
 This hints at the solution: the default number of erosions is too much.
@@ -4081,14 +4176,15 @@ $ astnoisechisel r.fits -h0 --tilesize=100,100 -P | 
grep erode
 @end example
 
 @noindent
-We see that the value to @code{erode} is @code{2}!
-The default NoiseChisel parameters are primarily targetted to processed images 
(where there is correlated noise due to all the processing that has gone into 
the warping and stacking of raw images).
+We see that the value of @code{erode} is @code{2}.
+The default NoiseChisel parameters are primarily targetted to processed images 
(where there is correlated noise due to all the processing that has gone into 
the warping and stacking of raw images, see @ref{NoiseChisel optimization for 
detection}).
 In those scenarios 2 erosions are commonly necessary.
 But here, we have a single-exposure image where there is no correlated noise 
(the pixels aren't mixed).
 So let's see how things change with only one erosion:
 
 @example
-$ astnoisechisel r.fits -h0 --tilesize=100,100 --erode=1 --checkdetection
+$ astnoisechisel r.fits -h0 --tilesize=100,100 --erode=1 \
+                 --checkdetection
 @end example
 
 Looking at the @code{OPENED-AND-LABELED} extension again, we see that the 
main/large detection is now much larger than before.
@@ -4106,37 +4202,26 @@ $ astnoisechisel r.fits -h0 --tilesize=100,100 --erode=1
 
 The @code{DETECTIONS} extension of @code{r_detected.fits} closely follows what 
the @code{DETECTION-FINAL} of the check image (looks good!).
 But if you go ahead to the @code{SKY} extension, you'll see the footprint of 
the galaxy again!
-It is MUCH better than before, but still problematic.
-
-The Sky is estimated over the same tiles that were used for quantile 
thresholding, but this time, we first look at what fraction of a tile has 
un-detected pixels: the tiles that are fully covered by a detection will have a 
sky (un-detected) fraction of 0, and if it has no detections within it, it will 
have a sky fraction of 1.
-You can define your acceptable sky fraction through the @code{--minskyfrac} 
option.
-Please run the previous command with @code{-P} to see its default value.
-Let's increase @code{--minskyfrac} to @code{0.8} to see how it looks:
+It is MUCH better (weaker) than before, but still visible.
 
-@example
-$ astnoisechisel r.fits -h0 --tilesize=100,100 --erode=1 \
-                 --minskyfrac=0.8
-@end example
-
-The M51 footprint on the @code{SKY} extension is now much less, which is good!
-But it is still present! This is happening because the wings are so diffuse 
and large.
+This is happening because the outer low surface brightness wings are so 
diffuse and large.
 Look at the @code{DETECTIONS} again, you will see the right-ward edges of 
M51's detected pixels have many ``holes'' that are fully surrounded by signal 
(value of @code{1}) and the signal stretches out in the noise very thinly (the 
size of the holes increases as we go out).
 This suggests that there is still undetected signal that is causing that bias 
in the @code{SKY} extension.
 Therefore, we should dig deeper into the noise.
 
-With the @option{--detgrowquant} option, NoiseChisel will use the detections 
as seeds and grow them in to the noise.
+With the @option{--detgrowquant} option, NoiseChisel will ``grow'' the 
detections in to the noise.
 Its value is the ultimate limit of the growth in units of quantile (between 0 
and 1).
 Therefore @option{--detgrowquant=1} means no growth and 
@option{--detgrowquant=0.5} means an ultimate limit of the Sky level (which is 
usually too much and will cover the whole image!).
 See Figure 2 of arXiv:@url{https://arxiv.org/pdf/1909.11230.pdf,1909.11230} 
for more on this option.
 Try running the previous command with various values (from 0.6 to higher 
values) to see this option's effect on this dataset.
-For this particularly huge galaxy (with signal that extends very gradually 
into the noise), we'll set it to @option{0.7}:
+For this particularly huge galaxy (with signal that extends very gradually 
into the noise), we'll set it to @option{0.75}:
 
 @example
 $ astnoisechisel r.fits -h0 --tilesize=100,100 --erode=1 \
-                 --minskyfrac=0.8 --detgrowquant=0.8
+                 --detgrowquant=0.75
 @end example
 
-Beyond this level (smaller @option{--detgrowquant} values), you see the 
smaller background galaxies starting to create thin spider-leg-like features, 
showing that we are following correlated noise for too much.
+Beyond this level (smaller @option{--detgrowquant} values), you see many of 
the smaller background galaxies (towards the right side of the image) starting 
to create thin spider-leg-like features, showing that we are following 
correlated noise for too much.
 Please try it for your self by changing it to @code{0.6} for example.
 
 When you look at the @code{DETECTIONS} extension of the command shown above, 
you see the wings of the galaxy being detected much farther out, But you also 
see many holes which are clearly just caused by noise.
@@ -4145,12 +4230,12 @@ In this case, a maximum area/size of 10,000 pixels 
seems to be good:
 
 @example
 $ astnoisechisel r.fits -h0 --tilesize=100,100 --erode=1 \
-                 --minskyfrac=0.8 --detgrowquant=0.8 \
-                 --detgrowmaxholesize=10000
+                 --detgrowquant=0.75 --detgrowmaxholesize=10000
 @end example
 
-The footprint of the galaxy has almost disappeared (the maximum value is now 
outside the galaxy) but still exists in the @code{SKY} extension.
-Let's calculate the significance of the undetected gradient, in units of noise.
+The footprint of the galaxy has almost disappeared and the galaxy footprint no 
longer has the same shape as the outskirts of the galaxy.
+But it is still present in the @code{SKY} extension.
+Before continuing, let's pause for a moment and calculate the significance of 
this gradient in the Sky, in units of the image noise.
 Since the gradient is roughly along the horizontal axis, we'll collapse the 
image along the second (vertical) FITS dimension to have a 1D array (a table 
column, see its values with the second command).
 
 @example
@@ -4158,7 +4243,8 @@ $ astarithmetic r_detected.fits 2 collapse-mean -hSKY 
-ocollapsed.fits
 $ asttable collapsed.fits
 @end example
 
-We can now calculate the minimum and maximum values of this array and define 
their difference (in units of noise) as the gradient:
+We can now calculate the minimum and maximum values of this array and define 
their difference (in units of noise) as the gradient.
+We'll then calculate the mean noise value in the image and divide the gradient 
by that noise level.
 
 @example
 $ grad=$(astarithmetic r_detected.fits 2 collapse-mean set-i   \
@@ -4169,29 +4255,44 @@ $ echo $std
 $ astarithmetic -q $grad $std /
 @end example
 
-The undetected gradient (@code{grad} above) is thus almost 1/10th of the 
noise-level (which is good).
-But don't forget that this is per-pixel: individually its small, but it 
extends over millions of pixels, so the total flux may still be relevant.
+The undetected gradient (@code{grad} above) is thus almost 1/100th of the 
noise-level (which is great!).
+But don't forget that this is per-pixel: individually its (very!) small, but 
it extends over millions of pixels, so the total flux may still be relevant for 
a very precise measurement of the diffuse emission.
 
-When looking at the raw input shallow image, you don't see anything so far out 
of the galaxy.
-You might just think that ``this is all noise, I have just dug too deep and 
I'm following systematics''! If you feel like this, have a look at the deep 
images of this system in @url{https://arxiv.org/abs/1501.04599, Watkins et al. 
[2015]}, or a 12 hour deep image of this system (with a 12-inch telescope): 
@url{https://i.redd.it/jfqgpqg0hfk11.jpg}@footnote{The image is taken from this 
Reddit discussion: 
@url{https://www.reddit.com/r/Astronomy/comments/9d6x0q/12_hours_of_exposure_on_the_wh
 [...]
+@cartouche
+@noindent
+@strong{Advanced Arithmetic:} To derive the significance of the gradient in 
the commands above, we had to create a temporary file and generally manually 
press a lot of keys!
+But thanks to Arithmetic's @ref{Reverse polish notation}, you can do all the 
steps above in the single command that is shown below (which is also faster!).
+The Arithmetic program is a very powerful and unique Gnuastro feature, if you 
master it, you can do many complex operations very easily and fast (both for 
you who are typing, and for the computer that is processing!).
+
+@example
+astarithmetic -q r_detected.fits -hSKY 2 collapse-mean set-i \
+              i maxvalue i minvalue - \
+              r_detected.fits -hSKY_STD meanvalue  /
+@end example
+@end cartouche
+
+When looking at the raw input image (which is very ``shallow'': less than a 
minute exposure!), you don't see anything so far out of the galaxy.
+You might just think to yourself that ``this is all noise, I have just dug too 
deep and I'm following systematics''! If you feel like this, have a look at the 
deep images of this system in @url{https://arxiv.org/abs/1501.04599, Watkins et 
al. [2015]}, or a 12 hour deep image of this system (with a 12-inch telescope): 
@url{https://i.redd.it/jfqgpqg0hfk11.jpg}@footnote{The image is taken from this 
Reddit discussion: 
@url{https://www.reddit.com/r/Astronomy/comments/9d6x0q/12_hours_of_exposu [...]
 In these deeper images you clearly see how the outer edges of the M51 group 
follow this exact structure, below in @ref{Achieved surface brightness level}, 
we'll measure the exact level.
 
 As the gradient in the @code{SKY} extension shows, and the deep images cited 
above confirm, the galaxy's signal extends even beyond this.
 But this is already far deeper than what most (if not all) other tools can 
detect.
-Therefore, we'll stop configuring NoiseChisel at this point in the tutorial 
and let you play with it a little more while reading more about it in 
@ref{NoiseChisel}.
-
-After finishing this tutorial please go through the NoiseChisel paper and its 
options and play with them to see if you can further decrease the gradient.
-This will greatly help you get a good feeling of the options.
+Therefore, we'll stop configuring NoiseChisel at this point in the tutorial 
and let you play with the other options a little more, while reading more about 
it in the papers (@url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa 
[2015]} and @url{https://arxiv.org/abs/1909.11230, Akhlaghi [2019]}) and 
@ref{NoiseChisel}.
 When you do find a better configuration feel free to contact us for feedback.
 Don't forget that good data analysis is an art, so like a sculptor, master 
your chisel for a good result.
 
+To avoid typing all these options every time you run NoiseChisel on this 
image, you can use Gnuastro's configuration files, see @ref{Configuration 
files}.
+For an applied example of setting/using them, see @ref{Option management and 
configuration files}.
+
 @cartouche
 @noindent
-@strong{This NoiseChisel configuration is NOT GENERIC:} Don't use the 
configuration derived above, on another image blindly.
+@strong{This NoiseChisel configuration is NOT GENERIC:} Don't use the 
configuration derived above, on another image @emph{blindly}.
 If you are unsure, just use the default values.
 As you saw above, the reason we chose this particular configuration for 
NoiseChisel to detect the wings of the M51 group was strongly influenced by the 
noise properties of this particular image.
-So as long as your image noise has similar properties (from the same 
data-reduction step of the same database), you can use this configuration on 
any image.
-But for images from other instruments, or higher-level/reduced SDSS products, 
please follow a similar logic to what was presented here and find the best 
configuration yourself.
+Remember @ref{NoiseChisel optimization for detection}, where we looked into 
the very deep XDF image which had strong correlated noise?
+
+As long as your other images have similar noise properties (from the same 
data-reduction step of the same instrument), you can use your configuration on 
any of them.
+But for images from other instruments, please follow a similar logic to what 
was presented in these tutorials to find the optimal configuration.
 @end cartouche
 
 @cartouche
@@ -4268,10 +4369,10 @@ You'll see that the detected edge of the M51 group is 
now clearly visible.
 You can use @file{edge.fits} to mark (set to blank) this boundary on the input 
image and get a visual feeling of how far it extends:
 
 @example
-$ astarithmetic r.fits edge.fits nan where -ob-masked.fits -h0
+$ astarithmetic r.fits edge.fits nan where -oedge-masked.fits -h0
 @end example
 
-To quantify how deep we have detected the low-surface brightness regions, 
we'll use the command below.
+To quantify how deep we have detected the low-surface brightness regions (in 
units of signal to-noise ratio), we'll use the command below.
 In short it just divides all the non-zero pixels of @file{edge.fits} in the 
Sky subtracted input (first extension of NoiseChisel's output) by the pixel 
standard deviation of the same pixel.
 This will give us a signal-to-noise ratio image.
 The mean value of this image shows the level of surface brightness that we 
have achieved.
@@ -4294,56 +4395,21 @@ $ astarithmetic $skysub $skystd / $edge not nan where   
    \
 
 @cindex Surface brightness
 We have thus detected the wings of the M51 group down to roughly 1/3rd of the 
noise level in this image! But the signal-to-noise ratio is a relative 
measurement.
-Let's also measure the depth of our detection in absolute surface brightness 
units; or magnitudes per square arcseconds.
-To find out, we'll first need to calculate how many pixels of this image are 
in one arcsecond-squared.
-Gnuastro Fits program has the @option{--pixelscale} option for this job:
+Let's also measure the depth of our detection in absolute surface brightness 
units; or magnitudes per square arcseconds (see @ref{Brightness flux 
magnitude}).
+Fortunately Gnuastro's MakeCatalog does this operation easily.
+SDSS image pixel values are calibrated in units of ``nanomaggy'', so the zero 
point magnitude is 22.5@footnote{From 
@url{https://www.sdss.org/dr12/algorithms/magnitudes}}.
 
 @example
-## Get the image pixel scale:
-$ astfits r.fits -h0 --pixelscale
-
-## Avoid all the human-friendly text (just print numbers):
-$ astfits r.fits -h0 --pixelscale --quiet
-
-## Take the first value, and convert it to arcseconds.
-$ pixscale=$(astfits r.fits -h0 --pixelscale --quiet \
-                     | awk '@{print $1*3600@}')
-
-## For a check:
-$ echo $pixscale
-@end example
-
-@noindent
-Note that we multiplied the value by 3600 so we work in units of arc-seconds 
not degrees.
-Now, let's calculate the average sky-subtracted flux in the border region per 
pixel.
-
-@example
-$ f=$(astarithmetic r_detected.fits edge.fits not nan where set-i \
-                    i sumvalue i numbervalue / -q -hINPUT-NO-SKY)
-$ echo $f
-@end example
-
-@noindent
-We can just multiply the two to get the average flux on this border in one 
arcsecond squared.
-The pixel values are calibrated to be in units of ``nanomaggy'', so the zero 
point magnitude is 22.5@footnote{From 
@url{https://www.sdss.org/dr12/algorithms/magnitudes}}.
-Therefore we can get the surface brightness of the outer edge (in magnitudes 
per arcsecond squared) using the following command.
-Just note that @code{log} in AWK is in base-2 (not 10), and that AWK doesn't 
have a @code{log10} operator.
-So we'll do an extra division by @code{log(10)} to correct for this.
-
-@example
-$ z=22.5
-$ echo "$pixscale $f $z" | awk '@{print -2.5*log($1*$2)/log(10)+$3@}'
---> 28.6269
+astmkcatalog edge.fits -h1 --valuesfile=r_detected.fits \
+             --zeropoint=22.5 --ids --surfacebrightness
+asttable edge_cat.fits
 @end example
 
-On a single-exposure SDSS image, we have reached a surface brightness limit 
fainter than 28 magnitudes per arcseconds squared!
-
+We have thus reached an outer surface brightness of @mymath{25.69} 
magnitudes/arcsec@mymath{^2} (second column in @file{edge_cat.fits}) on this 
single exposure SDSS image!
 In interpreting this value, you should just have in mind that NoiseChisel 
works based on the contiguity of signal in the pixels.
-Therefore the larger the object, the deeper NoiseChisel can carve it out of 
the noise.
-In other words, this reported depth, is only for this particular object and 
dataset, processed with this particular NoiseChisel configuration: if the M51 
group in this image was larger/smaller than this, or if the image was 
larger/smaller, or if we had used a different configuration, we would go 
deeper/shallower.
-
-To avoid typing all these options every time you run NoiseChisel on this 
image, you can use Gnuastro's configuration files, see @ref{Configuration 
files}.
-For an applied example of setting/using them, see @ref{Option management and 
configuration files}.
+Therefore the larger the object (with a similarly diffuse emission), the 
deeper NoiseChisel can carve it out of the noise.
+In other words, this reported depth, is the depth we have reached for this 
object in this dataset, processed with this particular NoiseChisel 
configuration.
+If the M51 group in this image was larger/smaller than this, or if the image 
was from a different instrument, or if we had used a different configuration, 
we would go deeper/shallower.
 
 To continue your analysis of such datasets with extended emission, you can use 
@ref{Segment} to identify all the ``clumps'' over the diffuse regions: 
background galaxies and foreground stars.
 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]