Ah right, I believed the update! would add the element to the list destructively. After changing the second variant to include the newly created element when not found. The alist gets much more slower than the hash-table.
Now it makes much more sense. The second variant was only fast because it was searching an empty list. I changed the second variant to loop with the returned value of “alist-update!” and it is much more slower now.
I can’t really argue why alist-update and alist-update! were designed but I was really expecting the element to get automatically added to the list with the destructive version. But I guess for speed reasons, it makes sense to just return a (cons element lst) instead of adding it. The doc should probably be more clear about that.
That said, the hash-table makes much more sense to use now.
Loïc Faure-LacroixSent with Airmail
On November 12, 2013 at 2:48:56 PM, Peter Bex (address@hidden) wrote:
On Tue, Nov 12, 2013 at 02:33:23PM +0400, Loïc Faure-Lacroix wrote:
> For a tessellation function, I believe I should use a hash table or a alist to save the index of some points to prevent duplicates. Yesterday I felt I should test how fast would the hash-table work to index my 3d coordinates. For that reason, I wrote that small benchmark and realized that the destructive function on alist gives impressive speed agains the non destructive loop. The hash table gives results that are close to the alist (destructive).
> To use the alist, I’d have to implement a hashing function which could then give results pretty similar to the hash table. But the non destructive alist scares me a little.
> To do the same job, it takes 650s for the non destructive function and 0.014s for the use of destructive “alist-update!”. Anyone can explain why is this so slow?
> Also what is the big difference between alist and a hash-table other than alist doesn’t seem to hash objects. After testing again, I realize that “alist-update!” isn’t even adding new element to the current list.
alist-update! will simply set the cdr if it finds it, otherwise cons it
onto the front of the list. That makes it O(n) to locate the key, and
O(1) to update/insert.
alist-update will take O(n) to locate the key just like alist-update!,
but when it finds the entry, it will need to build a new list with
the entry replaced at the same position. That means it's O(n) to update
(for new keys it's O(1), they can just be consed onto the front).
In total, that's O(2n). The major extra time you see can not be
explained due to that, but the output of time gives a clear hint.
The performance hit is due to the extra garbage this creates: if
you need to replace the last item in a list of length N, it will need
to create N new pairs, thereby ensuring that all the pairs of the whole
original list are now garbage to be collected. This puts a lot of
pressure on the GC.
By the way: you can gain another performance boost by not using alist-ref
twice, but caching the old value in a LET.