[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: plists, alists, and hashtables

From: Pascal J. Bourguignon
Subject: Re: plists, alists, and hashtables
Date: Wed, 05 Aug 2015 23:41:03 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

Ted Zlatanov <address@hidden> writes:

> That's all right, but I would probably prefer a Unicode pair of
> characters.  In 2015, that's going to inconvenience very few people.

Not on GUI, but people using emacs in (virtual) terminals are still
numerous, and there support for unicode, notably fancy characters, is
much more limited. 

Therefore I would advise to implement a unicode syntax only as an
additionnal alternative, not as the main one.

> So as a first cut, maybe «(k1 . v1) (k2 . v2)» and ««(k1 . v1) (k2 .
> v2)»» would be a good syntax (equal and eql versions respectively),
> simply converted to the appropriate #s(hash-table ...) syntax?

I would drop the dot.  And I fail to see a reason to keep the
parentheses too.

In my CL libraries, I have a hashtable function, like list or vector:

    (hashtable :test 'equal :size 1000
               :elements '((k1 v1)
                           (k2 v2)
                           (kn vn)))  

but we could lighten it, providing two functions similar to list or vector:

    (hash-table-eql k1 v1
                   k2 v2
                   kn v2)

    (hash-table-equal k1 v1
                      k2 v2
                      kn v2)

Notice that once you have those functions, you can use:

       #.(hash-table-eql :one 1 :two 2)

in CL, so you don't need a specific syntax, if you have only a few literals.

On Cocoa, the dictionaries can be created with class methods like these
constructor functions, alternating key and values (but finishing the
vararg list with an ugly nil!), and they also provide versions where you
pass two lists, of keys and values (well they reverse the order of value
and key for grammatical reasons, but this is irrelevant). 

So we could also have those functions:

   (make-hash-table-eql   (list k1 k2 k3)
                          (list v1 v2 v3))

   (make-hash-table-equal (list k1 k2 k3)
                          (list v1 v2 v3))

Once you have that, there's very little to be gained by adding a syntax
such as:

                   «k1 v1 k2 v2 k3 v3»

    (hash-table-eql k1 v1 k2 v2 k3 v3)

On the contrary, you introduce an ambiguity:

   (loop repeat 10 collect «k1 v1 k2 v2 k3 v3»)

will this allocate ten hash tables, or is it a list refering ten times
the same hash table?  (Actually this is the questions that renders me
incapable of programming in Ruby or Python, I never know what those
syntaxes do).

Is «k1 v1 k2 v2 k3 v3» a hash table, a vector, a string or what?

At least with:     (hash-table-eql k1 v1 k2 v2 k3 v3)

we see obviously that it's about eql hash-table, and the lisp
constructor pattern is clear.

> RT> If a hash-map type were used then there could be a bit to signify if
> RT> it's *really* a hash.  I've heard that some of the scripting languages
> RT> are doing this because they've found what I found; that most maps
> RT> contain so few elements that hashing just makes things slower.  But, the
> RT> bit would have to be tested before every map operation.
> I think the bit check performance penalty would be insignificant, and
> that the backend implementation can adapt to use lists for small maps.
> In any case, using these would be voluntary, people can still use the
> usual alists and plists.

Well, that's another problem :-)

- who knows whether hash-table already implement some kind of adaptative
  data structure or not?  You've checked the C sources?

- so, we wrap a-lists and hash-table in a dictionary abstraction.  Now
  people have the option between a-lists, p-lists or dictionaries, and
  in certain cases, vectors too (I encounter the situation where you
  want actually to maintain a vector and a hash-table in parallel in a
  common abstraction), so they will wrap a new map abstraction over
  those different implementation strategies again?

You can have a look at ACL2, the theorem prover written over Common
Lisp, using an abstract "subset/superset" of Common Lisp.  There are no
vectors, only lists (I don't remember the situation about hash-tables,
but probably they're not present, and you have to prove them using lists
(a-lists)).  On one hand, the use of a "rudimentary" data structure
like lisp lists allow them to provide a common abstracted API for all
sequence types, along with proven axioms and theorem on their usage, on
the other hand this doesn't prevent a compiler to translate the code
written in ACL2 into lower datastructures like vectors or dictionaries
(once the theorems have been proven) to extract performance, and on a
final hand, or you can just run the code on Common Lisp, and normal
lists will be used (but you know it'll be bug free, since it's proven).

The point here is that lisp lists and sexp provide a way to denote
abstract data structures that could have better performance
characteristics than just the lists it looks like.  It's basically up to
a "compiler" or a macro, to generate the better code from the high level

When you write the lisp form:      (hash-table-eql k1 v1 k2 v2 k3 v3)
you've actually written a list, 
but there is a function named hash-table-eql 
that turns it into a hash-table.

Now, you could also write:         (k1 v1 k2 v2 k3 v3)
and have that be turned into a hash-table,
because you would have written it in the context of a DSL or macro,
that would have proven or assumed it is hash-table data.

In both cases, the sexp notation with no supplementary syntaxes is
enough to denote your high level data structures. 

If you want to cater to the need of the future programmers, perhaps it
would be better to go toward making lisp a more high level programming
language, not adding syntax and specific distinct data structures, but
by having a smarter "eval" engine that is able to prove things about the
denoted data and programs, and to translate it into the most adapted
data structures and algorithms.

It's the same discussion we had about (1+ x) vs (+ 1 x).

1+ is supposedly more efficient (and also, pre-curried, for use with
mapcar).  But one can argue that (+ 1 x) is more general.  I doesn't
matter anymore, because nowdays the compiler will generate the same code
for both.

Similarly, instead of dicussing about #S(hash-table) vs «» vs
#.(hash-table …), it would be better to think about implementing theorem
prover features into a lisp to make it a higher level programming
language, so you don't have to care about what exact low level data
structure or algorithm is used, as long as the system has been able to
prove nice features of your high level description, which can be already
written with plain sexps.

__Pascal Bourguignon__       
“The factory of the future will have only two employees, a man and a
dog. The man will be there to feed the dog. The dog will be there to
keep the man from touching the equipment.” -- Carl Bass CEO Autodesk

reply via email to

[Prev in Thread] Current Thread [Next in Thread]