[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
intervals.c: About interval tree
From: |
Jambunathan K |
Subject: |
intervals.c: About interval tree |
Date: |
Mon, 15 Apr 2013 10:08:05 +0530 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux) |
I have been closely looking at intervals.c and I have a few questions.
Type of tree (in standard literature)?
=====================================
It seems the tree that is used is "mostly balanced tree". I am
wondering what sort of "balancedness" we are talking about i.e.. what
invariant does the tree satisfy?
A cursory reading of `balance_an_interval' suggests that, at a given
node, choose one of
1. No rotation
2. Left rotate
3. Right rotate
which minmizes the diff between weight of left and right subtrees i.e.,
(weight => total length of the interval)
Is that balancing rule a heuristic or does it give nice algorithmic
properties?
----------------------------------------------------------------
Similar to (but not same as) this?
=================================
The closest (in form) to what current `balance_an_interval' does is the
one that I see from
http://www8.cs.umu.se/kurser/TDBAfl/VT06/algorithms/WEBSITE/IMPLEMEN/HANDBOOK/DISTRIB/HANDBOOK/ALGS/32/3415_R_C.HTM
But the code snippet there does *some more jugglery* beyond merely
comparing the left and right weights.
----------------------------------------------------------------
Is history relevant - Apropos threshold for weight balancing?
=============================================================
The weight balanced trees that I know of define a threshold that goes
something like what is seen here in the below pages.
http://xlinux.nist.gov/dads//HTML/bbalphatree.html
http://xlinux.nist.gov/dads//HTML/balance.html
(The first link points to code examples for checking balancedness.)
The very initial versions of intervals.c, has this
,----
| /* Factor for weight-balancing interval trees. */
| Lisp_Object interval_balance_threshold;
`----
and balancing was indeed taking this form
,----
| register int total_children_size = (LEFT_TOTAL_LENGTH (i)
| + RIGHT_TOTAL_LENGTH (i));
| register int threshold = (XFASTINT (interval_balance_threshold)
| * (total_children_size / 100));
|
| if (LEFT_TOTAL_LENGTH (i) > RIGHT_TOTAL_LENGTH (i)
| && (LEFT_TOTAL_LENGTH (i) - RIGHT_TOTAL_LENGTH (i)) > threshold)
| return rotate_right (i);
|
| if (LEFT_TOTAL_LENGTH (i) > RIGHT_TOTAL_LENGTH (i)
| && (LEFT_TOTAL_LENGTH (i) - RIGHT_TOTAL_LENGTH (i)) > threshold)
| return rotate_right (i);
`----
but the threshold removed as part of this commit
,----
| revno: 5416
| committer: Richard M. Stallman <address@hidden>
| timestamp: Sun 1994-01-02 19:01:15 +0000
`----
I think the above re-working is part of the story that jwz relates in
,---- RMS words referred to in http://www.jwz.org/doc/lemacs.html
| But mainly we didn't even get to this stage. Problems became
| visible at the general design level.
|
| During that conversation I found out about the problem with
| slowness of interval processing. I called Arceneaux and looked at
| his code, and found a localized bug in the balancing of binary
| trees. If you created all the intervals from front to back, it
| would do no balancing, leaving the tree maximally unbalanced. The
| effect was a rather roundabout linear search. As luck would have
| it, the Lucid application usually created intervals from front to
| back, so they always saw the worst possible behavior. Anyway, the
| bug was simple once found. This all took a few days.
`----
- intervals.c: About interval tree,
Jambunathan K <=