[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Chicken-users] ANN: new eggs "llrb-syntax" and "llrb-tree"

From: Marc Feeley
Subject: Re: [Chicken-users] ANN: new eggs "llrb-syntax" and "llrb-tree"
Date: Mon, 26 Oct 2015 08:20:56 -0400

Sorry, I had forgotten our email exchanges on the subject 7 years ago!

I’ll have to look into your implementation because I don’t understand why the 
allocation free version is slower.


> On Oct 26, 2015, at 7:43 AM, Jörg F. Wittenberger <address@hidden> wrote:
> Hi Marc,
> Am 25.10.2015 um 19:42 schrieb Marc Feeley:
>> Interesting… it looks a lot like the define-rbtree macro used in the Gambit 
>> runtime system to implement priority queues for the scalable thread 
>> scheduler.
>> Is there any historical link between your implementation and the Gambit one?
>> Marc
> Yes, there is.  Documented in the archives of this mailing list from
> 2008.  Search for "Feeley priority queue".
> I studied the rb-tree code as well as weight balanced trees and skip
> lists.  And tried them in the scheduler.
> Eventually I settled with llrb.  The latter especially because the
> implementation as a syntax would allow me to produce lowlevel chicken
> code and have threads with slots for two independent queues timeout and
> blocking reason (directly in the thread object, not as two separate
> references to some intermediate queue entry node).  And there was this
> license issue blocking inclusion into chicken.
> The archives even reflect my now proven false assumption that the
> allocation free version would be faster.  More work to do...  ;-)
> But more important than the use as a priority queue I found the use case
> as alists replacement.  Especially when the miss is not an error but
> handled.  I recall having seen this suspiciously often.
> Best
> /Jörg
>>> On Oct 24, 2015, at 2:57 PM, Jörg F. Wittenberger <address@hidden> wrote:
>>> Hi all,
>>> attached the code of two new eggs.
>>> llrb-syntax contains a syntax-rules implementation of left-leaning
>>> red-black trees.  It expands code maintaining LLRB trees in a data
>>> structure defined by the user.  Depending at the use case it will either
>>> allocate temporary space (in the pure case) or allocate no temporary
>>> space at all and use the minimum amount of mutations to maintain the
>>> LLRB tree in place.
>>> llrb-tree uses llrb-syntax to implement generic alternatives for
>>> assoc-list drop-ins (having a much better lookup behavior) and srfi-69
>>> hash table drop-ins.  Plus versions specialized for fixnums, strings and
>>> symbols as key.
>>> Both have a README, which should (hopefully) be ready to be pasted into
>>> the wiki.  (Can't test the syntax, but hey, it's not that much I can
>>> have done wrong.)
>>> As I don't see the point in hosting my own infrastructure to publish
>>> them (after all I assume that once I'm past the inevitable bugs an
>>> shortcomings those will not see frequent updates by nature), I'd prefer
>>> if those would eventually go into the CHICKEN projects repository.
>>> (Where I only have r/o access so far.)
>>> ---
>>> While preparing I learned something I want to share.  CHICKEN is always
>>> good for a surprise.  And I'm astonished to say the least.  Something is
>>> strange here.
>>> 1)  Intuitively I'd assume that conserving space to reduce the load on
>>> the garbage collector should pay off.  Even for CHCIKEN, which is
>>> supposed to be highly effective on temporary allocations.  The numbers
>>> (included in "numbers.txt" in the tarball, though those should probably
>>> not make it into svn) did prove me wrong.  The version specialized to
>>> fixnum keys is roughly 4x slower when running allocation free (thus
>>> loading the mutation stack) than the pure version.  The same I found to
>>> be true for mutating versions having string-keys.
>>> 2) I never had the idea to actually use LLRB trees as a *replacement*
>>> for hash tables.  After all LLRB maintains an order relation and is
>>> according to theories and urban legends *supposed* to be slower.  See
>>> yourself: in the fixnum case, llrb is almost twice as fast as the
>>> corresponding hash table.  (1M inserts in 90 seconds for Hashtable vs.
>>> 46 seconds for LLRB; References are even faster 1M refs in 19sec. vs.
>>> 10sec. for LLRB.)
>>> I'm astonished.
>>> Even the LLRB build using generic = and < is almost a fast and much
>>> faster than Hashtables.
>>> 3)  Things become slightly more confusing when taking symbols as key.
>>> For those hash tables appear about 30% faster than LLRB on inserts
>>> and about 5% faster on referencing.  (However randomization, which is
>>> commented out in the attached source was still active during the test.
>>> Maybe there's something hidden.)
>>> From memory: before I tested with smaller numbers of keys, which
>>> resulted in the opposite finding: LLRB was found relative faster for
>>> smaller numbers of keys.  After all it's O(log n), thus this is expected.
>>> 4) string->symbol puts #3 upside down:
>>> string->symbol 1000000 calls in 111935.0 ms
>>> caching the result in a LLRB-string tree naturally adds overhead on the
>>> first call: str2sym 1000000 refs in 185748.0ms – almost twice.
>>> Wait, what? *Almost* twice?  How comes?  Should have been around 230%
>>> according to #3.
>>> But on the second call string->symbol still takes those 112403.0ms,
>>> while the LLRB-cached version needs only 16321.0ms.
>>> I guess something is wrong in string->symbol.  So far I've been unable
>>> to spot it.
>>> But who would have predicted that a balanced tree could be sorta faster
>>> than a hash table?  Me not.
>>> Have a nice day!
>>> /Jörg
>>> PS: there's a tblbench.scm in the tarball as well, which produced those
>>> numbers.  After all that's kind of a benchmark and if there's something
>>> lying to me, then it's probably benchmark.
>>> <llrb.tar.gz>_______________________________________________
>>> Chicken-users mailing list
>>> address@hidden

reply via email to

[Prev in Thread] Current Thread [Next in Thread]