swarm-support
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Queston about Binary Search Trees


From: Theodore C. Belding
Subject: Re: Queston about Binary Search Trees
Date: Sun, 8 Nov 1998 15:02:12 -0500 (EST)

On Sun, 8 Nov 1998, Paul E. Johnson wrote:

> In the GNU C library, there is a set of functions--tsearch, tfind, twalk
> --that can be used to add nodes (pointers to objects) into a binary
> tree, and then searching for elements in that tree is  significantly
> faster than searching in a flat list (the difference is n-log(n), where
> n is the number of items in the tree (or list)).    

If you want your program to run on multiple Unix platforms, you might want
to be cautious about relying on non-standard C functions like these; not
all platforms have the GNU C library (glibc) installed.  But obviously,
that's a choice that you'll have to make yourself based on your own
priorities.  I believe that linux is currently the only platform that
commonly uses glibc. I might be wrong, but my impression is that glibc
isn't yet stable on all other platforms. Hopefully someone can correct me
if I'm wrong on this. 

It sounds like what you're looking for is what technically is known as a
multiset --- a set datatype that can hold multiple items with the same
key.  (Some references simply call this the "duplicate key" or "equal key" 
problem.) One way of handling this is to use a binary search tree and
simply not check for duplicate keys during the insertion operation.  For
example, duplicates might be stored to the right of items already in the
tree with the same key.  To find all items with a given key, simply keep
searching until a leaf node is reached (see Sedgewick, p. 507).  I don't
know if this works with the GNU C functions, since I'm not familiar with
them.  Marcus's solution is similar to this one, but it has the advantages
that it will work with any binary search tree implementation (also, you
can't insert the same individual twice by mistake.)

If it is important to you to be able to retrieve all items with a given
key easily and quickly, you might want to instead have each node in the
tree contain a linked list of all items with that node's key. (This is
analogous to a hash table with separate chaining, where conflicts are
resolved by inserting items with the same hash key into linked lists
(Sedgewick, p. 583).) Searching for a single item then becomes a search
for a given key, followed by a search down the appropriate linked list for
the specific item. This method is a bit more work to implement but
possibly more straightforward conceptually. 

Note that in all these solutions, both "search" and "delete" operations
are a bit ambiguous: you will have to decide whether you mean only the
first (or last) node found with the key, or all nodes with the key. 

If you'd like more information (with sample code) on binary search trees
and other data types, take a look at:
Sedgewick, Robert. (1998). Algorithms in C. 3rd ed. Parts 1-4.
Addison-Wesley.

Two more technical references are:
Cormen, Thomas H., Charles E. Leiserson, and Ronald L. Rivest. (1994).
Introduction to Algorithms. MIT Press.

Knuth, Donald E. (1998). The Art of Computer Programming. vol. 3:
Sorting and Searching. 2nd ed. Addison-Wesley.

Hope that helps.
-Ted

--
Ted Belding                               address@hidden 
University of Michigan Program for the Study of Complex Systems
http://www-personal.umich.edu/~streak/




                  ==================================
   Swarm-Support is for discussion of the technical details of the day
   to day usage of Swarm.  For list administration needs (esp.
   [un]subscribing), please send a message to <address@hidden>
   with "help" in the body of the message.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]