[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Chicken-users] ANN: new eggs "llrb-syntax" and "llrb-tree"

From: Jörg F. Wittenberger
Subject: Re: [Chicken-users] ANN: new eggs "llrb-syntax" and "llrb-tree"
Date: Mon, 26 Oct 2015 12:43:41 +0100
User-agent: Mozilla/5.0 (X11; Linux armv7l; rv:31.0) Gecko/20100101 Icedove/31.7.0

Hi Marc,

Am 25.10.2015 um 19:42 schrieb Marc Feeley:
> Interesting… it looks a lot like the define-rbtree macro used in the Gambit 
> runtime system to implement priority queues for the scalable thread scheduler.
> Is there any historical link between your implementation and the Gambit one?
> Marc

Yes, there is.  Documented in the archives of this mailing list from
2008.  Search for "Feeley priority queue".

I studied the rb-tree code as well as weight balanced trees and skip
lists.  And tried them in the scheduler.

Eventually I settled with llrb.  The latter especially because the
implementation as a syntax would allow me to produce lowlevel chicken
code and have threads with slots for two independent queues timeout and
blocking reason (directly in the thread object, not as two separate
references to some intermediate queue entry node).  And there was this
license issue blocking inclusion into chicken.

The archives even reflect my now proven false assumption that the
allocation free version would be faster.  More work to do...  ;-)

But more important than the use as a priority queue I found the use case
as alists replacement.  Especially when the miss is not an error but
handled.  I recall having seen this suspiciously often.



>> On Oct 24, 2015, at 2:57 PM, Jörg F. Wittenberger <address@hidden> wrote:
>> Hi all,
>> attached the code of two new eggs.
>> llrb-syntax contains a syntax-rules implementation of left-leaning
>> red-black trees.  It expands code maintaining LLRB trees in a data
>> structure defined by the user.  Depending at the use case it will either
>> allocate temporary space (in the pure case) or allocate no temporary
>> space at all and use the minimum amount of mutations to maintain the
>> LLRB tree in place.
>> llrb-tree uses llrb-syntax to implement generic alternatives for
>> assoc-list drop-ins (having a much better lookup behavior) and srfi-69
>> hash table drop-ins.  Plus versions specialized for fixnums, strings and
>> symbols as key.
>> Both have a README, which should (hopefully) be ready to be pasted into
>> the wiki.  (Can't test the syntax, but hey, it's not that much I can
>> have done wrong.)
>> As I don't see the point in hosting my own infrastructure to publish
>> them (after all I assume that once I'm past the inevitable bugs an
>> shortcomings those will not see frequent updates by nature), I'd prefer
>> if those would eventually go into the CHICKEN projects repository.
>> (Where I only have r/o access so far.)
>> ---
>> While preparing I learned something I want to share.  CHICKEN is always
>> good for a surprise.  And I'm astonished to say the least.  Something is
>> strange here.
>> 1)  Intuitively I'd assume that conserving space to reduce the load on
>> the garbage collector should pay off.  Even for CHCIKEN, which is
>> supposed to be highly effective on temporary allocations.  The numbers
>> (included in "numbers.txt" in the tarball, though those should probably
>> not make it into svn) did prove me wrong.  The version specialized to
>> fixnum keys is roughly 4x slower when running allocation free (thus
>> loading the mutation stack) than the pure version.  The same I found to
>> be true for mutating versions having string-keys.
>> 2) I never had the idea to actually use LLRB trees as a *replacement*
>> for hash tables.  After all LLRB maintains an order relation and is
>> according to theories and urban legends *supposed* to be slower.  See
>> yourself: in the fixnum case, llrb is almost twice as fast as the
>> corresponding hash table.  (1M inserts in 90 seconds for Hashtable vs.
>> 46 seconds for LLRB; References are even faster 1M refs in 19sec. vs.
>> 10sec. for LLRB.)
>> I'm astonished.
>> Even the LLRB build using generic = and < is almost a fast and much
>> faster than Hashtables.
>> 3)  Things become slightly more confusing when taking symbols as key.
>> For those hash tables appear about 30% faster than LLRB on inserts
>> and about 5% faster on referencing.  (However randomization, which is
>> commented out in the attached source was still active during the test.
>> Maybe there's something hidden.)
>> From memory: before I tested with smaller numbers of keys, which
>> resulted in the opposite finding: LLRB was found relative faster for
>> smaller numbers of keys.  After all it's O(log n), thus this is expected.
>> 4) string->symbol puts #3 upside down:
>> string->symbol 1000000 calls in 111935.0 ms
>> caching the result in a LLRB-string tree naturally adds overhead on the
>> first call: str2sym 1000000 refs in 185748.0ms – almost twice.
>> Wait, what? *Almost* twice?  How comes?  Should have been around 230%
>> according to #3.
>> But on the second call string->symbol still takes those 112403.0ms,
>> while the LLRB-cached version needs only 16321.0ms.
>> I guess something is wrong in string->symbol.  So far I've been unable
>> to spot it.
>> But who would have predicted that a balanced tree could be sorta faster
>> than a hash table?  Me not.
>> Have a nice day!
>> /Jörg
>> PS: there's a tblbench.scm in the tarball as well, which produced those
>> numbers.  After all that's kind of a benchmark and if there's something
>> lying to me, then it's probably benchmark.
>> <llrb.tar.gz>_______________________________________________
>> Chicken-users mailing list
>> address@hidden

reply via email to

[Prev in Thread] Current Thread [Next in Thread]