bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] hamt: New module.


From: Marc Nieper-Wißkirchen
Subject: Re: [PATCH] hamt: New module.
Date: Sat, 3 Apr 2021 20:59:03 +0200

Hi Bruno,

what do you think of the attempt attached below?

Thanks,

Marc

diff --git a/doc/containers.texi b/doc/containers.texi
index 15c915b93..35caf200c 100644
--- a/doc/containers.texi
+++ b/doc/containers.texi
@@ -35,6 +35,9 @@ log
 Gnulib provides several generic container data types.  They can be used
 to organize collections of application-defined objects.
 
+@node Ordinary containers
+@subsection Ordinary container data types
+
 @multitable @columnfractions .15 .5 .1 .1 .15
 @headitem Data type
 @tab Details
@@ -599,6 +602,46 @@ For C++, Gnulib provides a C++ template class for each of these container data t
 @tab @code{"gl_omap.hh"}
 @end multitable
 
+@node Specialized containers
+@subsection Specialized container data types
+
+The @code{hamt} module provides a persistant version of persistent hash
+array mapped tries (HAMTs).  A HAMT is an array mapped trie in which
+elements are stored according to the initial bits of their hash values.
+
+In the current implementation, each inner node of the HAMT can store up
+to @math{32 = 2^5} elements and subtries.  Whenever a collision between
+the initial bits of the hash values of two elements happens, the next
+@math{5} bits of the hash values are examined and the two elements
+pushed down one level in the trie.
+
+HAMTs have the same average access times as hash tables but grow and
+shrink dynamically, so they use memory more economically and do not have
+to be periodically resized.
+
+They were described and analyzed in @cite{Phil Bagwell (2000). Ideal
+   Hash Trees (Report). Infoscience Department, École Polytechnique
+   Fédérale de Lausanne.}
+
+HAMTs are well-suited to a persistent data structure, which means that
+each updating operation (like inserting, replacing, or removing an
+element) returns a new HAMT while leaving the original one intact.  This
+is achieved through structure sharing, which is even safe in the
+presence of multiple threads when the used C compiler supports atomics.
+
+A HAMT can be used whenever an ordinary hash table would be used.  It
+does however, provide non-destructive updating operations without the
+need to copy the whole container  On the other hand, a hash table is
+simpler so that its performance may be better when persistence is not
+needed.
+
+For example, a HAMT can be used to model the dynamic environment in a
+LISP interpreter.  Updating a value in the dynamic environment of one
+continuation frame would not modify values in earlier frames.
+
+To use the module, include @code{hamt.h} in your code.  The public
+interface is documented in that header file.
+
 @ifnottex
 @unmacro log
 @end ifnottex


Am Sa., 3. Apr. 2021 um 19:02 Uhr schrieb Bruno Haible <bruno@clisp.org>:
Hi Marc,

> +     hamt: New module.
> +     This module provides (persistent) hash array mapped tries.
> +     * MODULES.html.sh: Add hamt.
> +     * lib/hamt.c: New file.
> +     * lib/hamt.h: New file.
> +     * modules/hamt: New file.
> +     * modules/hamt-tests: New file.
> +     * tests/test-hamt.c: New file.

Would you like to write a bit of documentation for the Gnulib manual
about it? It does not need to copy the extensive comments in hamt.h.
It need only answer the questions:
  - What is a HAMT?
  - When would I (a programmer) make use of it? How does it compare
    to other container data types?
  - What is the Gnulib module name and the include file name?

I think it would fit into the section "Container data types"
https://www.gnu.org/software/gnulib/manual/html_node/Container-data-types.html .

Find attached the start of a diff to add this documentation.

Bruno

reply via email to

[Prev in Thread] Current Thread [Next in Thread]