[task #15803] Match program builds k-d tree and later read from it

gnuastro-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[task #15803] Match program builds k-d tree and later read from it

From:	Mohammad Akhlaghi
Subject:	[task #15803] Match program builds k-d tree and later read from it
Date:	Sun, 28 Mar 2021 21:49:27 -0400 (EDT)
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:87.0) Gecko/20100101 Firefox/87.0

Update of task #15803 (project gnuastro):

        Percent Complete:                     70% => 90%                    

    _______________________________________________________

Follow-up Comment #6:

I done a full re-write in Commit a6c61757668
<https://gitlab.com/makhlaghi/gnuastro-dev/-/commit/a6c61757668> (commit
message in P.S.) and look clear for the final steps now :-). In short, I
noticed that all the troubles we were dealing with (one point matching
multiple in the other) was already treated before in the sort-based matching.

As Ken Thompson says in The Art of Unix programming: "One of my most
productive days was throwing away 1000 lines of code."! This fit so well here
:-)!

Sachin, the job is almost done, I have put some next steps in the commit
message, so if you can implement them in followup commits, we can finalize
things and finish it off with the documentation. I am really eager to see the
result of the test to see from what point the k-d tree becomes faster than the
sort-based method and how it improves (maybe as a plot showing two curves).

P.S. Commit message: 


Library (match.h): Using the old infra-structure for double-matches

Until now, we hadn't considered the fact that the main body of work for
cleaning the final match (removing multiple nearby matches and formatting
of the output) was already implemented in the sort-based match!

With this commit, instead of trying to debug the steps in the last few
commits, I just moved everything into a temporary file (deleted them from
'lib/match.c') and added everything modeled based on sort-based matching
implementation: using the same 'bina' array. Once that was implemented, it
was really easy to just use the final functions of the sort-based matching
and everything seems to work properly now, with no segmentation
fault.

In those functions of the sort-based matching that are now also used in the
k-d tree based matching, the '_coordinate_' part is removed and they have
been moved to a separate (generic) part of 'lib/match.c' to highlight that
they are low-level and shared between both matching methods.

Generally, in programming, its always good to avoid re-writing a function
and modularize/reuse previously written functions. So for example if a bug
is found (or a feature is added) in the sort-based matching, it will
automatically be implemented in the k-d tree based matching too and no
extra work is necessary.

Some other minor points:

 - The 'inplace' argument to the sort-based matching was for the sorting
   step. It is not used in the k-d tree based matching, so I removed it
   from the arguments.

Things remaining:

 - Find a good name for the sort-based matching.

 - Complete all the expected features for k-d tree based matching and write
   tests for them during 'make check'.

 - Test to see which matching methodology is faster for how many rows. It
   is expected that for very few rows (like 10 or 100) probably the
   sort-based matching is faster, but from a cetain number of rows and
   certain number of threads, the k-d tree based matching will be
   faster. Once we find this, we will be able to implement the 'automatic'
   mode.


    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/task/?15803>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/

[Prev in Thread]

Current Thread

[Next in Thread]

[task #15803] Match program builds k-d tree and later read from it, Mohammad Akhlaghi <=

Prev by Date: [task #15799] Add the autocomplete feature to gnuastro
Previous by thread: [task #15924] Auto-completion: Output Formatting
Index(es):
- Date
- Thread