Re: plists, alists, and hashtables

From: Pascal J. Bourguignon
Subject: Re: plists, alists, and hashtables
Date: Wed, 05 Aug 2015 19:24:31 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

Ted Zlatanov <address@hidden> writes:

> On Wed, 05 Aug 2015 08:12:22 +0200 "Pascal J. Bourguignon" <address@hidden> wrote:
> PJB> What you are losing from sight is the fact that:
> PJB> - a-lists are lists
> PJB> - p-lists are lists
> PJB> - lists are sequences
> PJB> - lists are cons cells or nil
> PJB> therefore any operator working on cons cells, on sequences, on lists,
> PJB> can also work on p-lists and on a-lists.
> Yes, I think the implicit advantage of everything being a list is well
> understood amongst us.  But so is the disadvantage of treating
> everything as a list.  The question is whether hashtables, an existing
> ELisp map data type, could become more popular.

How would that be good?

Seriously, hash-tables have a lot of drawbacks.
They use much more memory,
they are much slower (on small dictionaries),
they are much more restrictive in the key equivalence functions they allow.

The only advantage they have is speed of access in big dictionaries.
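
To illustrate the point about key equivalence functions, here is a minimal sketch, assuming string keys that should compare case-insensitively. Every name prefixed with `example-` is hypothetical and not part of the benchmark below. The a-list search takes an arbitrary predicate directly, while the hash table needs a test registered with `define-hash-table-test`, along with a hash function that agrees with that predicate.

```elisp
(require 'cl-lib)

;; All names prefixed with `example-' are hypothetical, for illustration only.
(defun example-case-insensitive-equal (a b)
  "Return non-nil when strings A and B are equal ignoring case."
  (string= (downcase a) (downcase b)))

(defun example-alist-lookup (key alist)
  "Look KEY up in ALIST, comparing string keys case-insensitively."
  ;; An a-list search accepts any two-argument predicate directly.
  (cdr (cl-assoc key alist :test #'example-case-insensitive-equal)))

;; A hash table needs its test registered under a name, together with
;; a hash function that agrees with the predicate.
(define-hash-table-test 'example-case-insensitive
  #'example-case-insensitive-equal
  (lambda (key) (sxhash (downcase key))))

(defun example-make-table ()
  "Return a hash table whose string keys compare case-insensitively."
  (make-hash-table :test 'example-case-insensitive))

;; Usage:
;; (example-alist-lookup "content-type" '(("Content-Type" . "text/plain")))
;; (let ((table (example-make-table)))
;;   (puthash "Content-Type" "text/plain" table)
;;   (gethash "content-type" table))
```

Both lookups find the entry; the difference is how much has to be set up before the hash table can do so.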

But even when you need O(1) access on a big dictionary, you will find
you keep converting between hash-tables and lists or vectors, if only to
sort the entries out to present them to the user!
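
The conversion mentioned above can be sketched as follows, assuming the keys are symbols or strings so that `string<` can order them. The function name is hypothetical.

```elisp
;; Hypothetical helper, for illustration only.
(defun example-hash-table-to-sorted-alist (table)
  "Return the entries of TABLE as an a-list sorted by key name.
Assumes the keys are symbols or strings."
  (let ((entries '()))
    ;; A hash table has no order of its own, so copy it into a list first.
    (maphash (lambda (k v) (push (cons k v) entries)) table)
    (sort entries (lambda (a b)
                    (string< (car a) (car b))))))

(defun example-sorted-demo ()
  "Build a small table and return its entries in key order."
  (let ((table (make-hash-table :test 'eq)))
    (puthash 'pear 3 table)
    (puthash 'apple 1 table)
    (puthash 'fig 2 table)
    (example-hash-table-to-sorted-alist table)))

;; (example-sorted-demo) => ((apple . 1) (fig . 2) (pear . 3))
```

With an a-list the `maphash` copy step disappears: the data is already a list that `sort` accepts.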

If you achieved your goal of having the unwashed masses use hash-tables
instead of a-lists/p-lists, the only result you'd attain would be to slow
down Emacs.  Run your own benchmark.  On my computer, I notice that
until the size of the dictionary reaches 20-30 entries, a-lists perform
better than hash-tables.  (What's more, I call assoc on set, while for
a-lists it is customary to just do acons, so inserting or resetting
entries would be faster still.)
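
The customary acons style mentioned above can be sketched like this; it is a minimal illustration with hypothetical names, not the set operation used in the benchmark. A new entry is pushed on the front without searching, and since `assoc` returns the first match, the newer entry shadows the older one.

```elisp
(require 'cl-lib)

;; Hypothetical demo function, for illustration only.
(defun example-shadowing-demo ()
  "Return the visible value for `color' and the length of the a-list."
  (let ((table '()))
    ;; Insert without searching first: constant time per operation.
    (setq table (cl-acons 'color 'red table))
    (setq table (cl-acons 'size 10 table))
    ;; "Resetting" a key pushes a new entry that shadows the old one.
    (setq table (cl-acons 'color 'blue table))
    (list (cdr (assoc 'color table))   ; the newest entry wins
          (length table))))            ; the shadowed entry is still stored

;; (example-shadowing-demo) => (blue 3)
```

The trade-off is that shadowed entries stay in the list, so lookups get slower as keys are reset many times.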

;;; -*- mode:emacs-lisp;lexical-binding:t;coding:utf-8 -*-
(setf lexical-binding t)
(require 'cl) ; for ecase, loop, acons, destructuring-bind, mapcar*, ...

(defun make-hash-table-dictionary ()
  (let ((table (make-hash-table)))
    (lambda (m &optional k v)
      (ecase m
        (get (gethash k table))
        (del (remhash k table))
        (set (setf (gethash k table) v))
        (key (let ((keys '()))
               (maphash (lambda (k v)
                          (declare (ignore v))
                          (push k keys))
                        table)
               keys))))))

(defun make-a-list-dictionary ()
  (let ((table '()))
    (lambda (m &optional k v)
      (ecase m
        (get (cdr (assoc k table)))
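        (del (let* ((head (cons nil table)) ; dummy cell before the first entry
                    (prev-entry (loop
                                  for cell on head
                                  ;; compare K with the key of the next entry
                                  until (eql k (car (cadr cell)))
                                  finally (return cell))))
               ;; unlink the entry following PREV-ENTRY, if one was found
               (when (cdr prev-entry)
                 (setf (cdr prev-entry) (cddr prev-entry)))
               (setf table (cdr head))))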
        (set (let ((entry (assoc k table)))
               (if entry
                   (setf (cdr entry) v)
                   (setf table (acons k v table)))))
        (key (mapcar (function car) table))))))

(defun make-null-dictionary ()
  (lambda (m &optional k v)
    (declare (ignore k v))
    (ecase m
      ((get del set key) nil))))

(defun dict-get (dict k)   (funcall dict 'get k))
(defun dict-del (dict k)   (funcall dict 'del k))
(defun dict-set (dict k v) (funcall dict 'set k v))
(defun dict-key (dict)     (funcall dict 'key))

(defun get-internal-real-time ()
  (destructuring-bind (high low microsec &rest ignored) (current-time)
    (declare (ignore ignored))
    (+ (* high 65536.0) low (* 1e-6 microsec))))

(defmacro chrono (&rest body)
  "Returns the time it took to evaluate the body expressions (in an implicit progn)."
  (let ((start (gensym))
        (result (gensym)))
    `(let* ((,start  (get-internal-real-time))
            (,result (progn ,@body)) )
       (- (get-internal-real-time) ,start))))

(defun generate-keys (size)
  (loop repeat size
        collect (gensym)))

(defun generate-values (size)
  (loop repeat size
        with max = (* size 1000)
        collect (random size)))

(defun benchmark (constructor size repeat)
  (let* ((dicts            '())
         (keys             (generate-keys size))
         (values           (generate-values size))
         (constructor-time (chrono (setf dicts (loop repeat repeat
                                                     collect (funcall constructor)))))
         (fill-time        (chrono (mapc (lambda (dict)
                                           (loop for k in keys for v in values
                                                 do (dict-set dict k v)))
                                         dicts)))
         (key-time-full    (chrono (mapc (function dict-key) dicts)))
         (get-time-full    (chrono (mapc (lambda (dict)
                                           (loop for k in keys for v in values
                                                 collect (eql v (dict-get dict k))))
                                         dicts)))
         (del-time         (chrono (mapc (lambda (dict)
                                           (loop for k in keys for v in values
                                                 for i from 0
                                                 when (oddp i)
                                                   do (dict-del dict k)))
                                         dicts)))
         (key-time-part    (chrono (mapc (function dict-key) dicts)))
         (get-time-part    (chrono (mapc (lambda (dict)
                                           (loop for k in keys for v in values
                                                 for i from 0
                                                 if (oddp i)
                                                   collect (null (dict-get dict k))
                                                 else
                                                   collect (eql v (dict-get dict k))))
                                         dicts))))
    (let ((count (float repeat)))
      (list :constructor-time (/ constructor-time count)
            :fill-time        (/ fill-time        count)
            :key-time-full    (/ key-time-full    count)
            :get-time-full    (/ get-time-full    count)
            :del-time         (/ del-time         count)
            :key-time-part    (/ key-time-part    count)
            :get-time-part    (/ get-time-part    count)))))

(defun benchmark-all ()
  (loop with repeat = 100
        for size in '(1 2 3 4 5 6 7 8 9 10 15 20 25 30 35 40 50 60 70 80 90 100)
        collect (loop
                  with null-case = (benchmark 'make-null-dictionary size repeat)
                  for constructor in '(make-a-list-dictionary
                                       make-hash-table-dictionary)
                  collect (list* constructor size
                                 (mapcar* (lambda (v b)
                                            (if (numberp v)
                                                (- v b)
                                                v))
                                          (benchmark constructor size repeat)
                                          null-case)))))

;; (benchmark-all)

> PJB> Unfortunately there are many more than 8 datastructures for which you
> PJB> could want a specific syntax. 
> You have to be specific about why we'd discuss or want these 8 data
> structures, since the discussion was only about hashtables.

Nonetheless, this argument is the reason why the print/read syntax for
hash-tables is:

> #s(hash-table size 65 test eql rehash-size 1.5 rehash-threshold 0.8 data ())

This extends easily to any other data structure for which you'd want a
literal representation.
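
As a small sketch of that read/print round trip (the function name is hypothetical, and the literal is shortened to the `test` and `data` fields):

```elisp
;; Hypothetical demo function, for illustration only.
(defun example-read-hash-table-literal ()
  "Read a hash table from its printed form, then print and re-read it."
  (let* ((table (read "#s(hash-table test equal data (\"a\" 1 \"b\" 2))"))
         ;; Printing the table yields text that `read' accepts again.
         (copy  (read (prin1-to-string table))))
    (list (hash-table-p table)
          (gethash "a" table)
          (hash-table-count copy)
          (gethash "b" copy))))

;; (example-read-hash-table-literal) => (t 1 2 2)
```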

__Pascal Bourguignon__       
“The factory of the future will have only two employees, a man and a
dog. The man will be there to feed the dog. The dog will be there to
keep the man from touching the equipment.” -- Carl Bass CEO Autodesk
