chicken-users
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Chicken-users] CHICKEN hang / crash / memoize egg


From: Peter Bex
Subject: Re: [Chicken-users] CHICKEN hang / crash / memoize egg
Date: Fri, 1 Apr 2016 20:32:45 +0200
User-agent: Mutt/1.5.23 (2014-03-12)

On Fri, Apr 01, 2016 at 03:28:26PM +0100, Andy Bennett wrote:
> Hi Peeps!
> 
> I'm running CHICKEN 4.9.0rc1 and I'm trying out the memoize egg.
> 
> I saw the tail-call-optimized version of the factorial procedure in the
> memoize documentation at http://api.call-cc.org/doc/memoize and I've
> been trying to modify it so that it memoizes intermediate results such
> that, for example, a call to (fact 10) makes a subsequent call to (fact
> 9) fast.
> 
> Sometimes I'll crash immediately and someytimes I have to press some
> keys when the "#;8>" prompt appears.
> 
> Is this a known bug in Chicken 4.9.0rc1 or the srfi-69 hash-tables that
> memoize uses?

There are two problems.  In your code you don't limit the size of the
memoized hash table, which means it won't ever delete anything from its
hash tables.

The other problem seems to be in the memoize egg: there, even if you do
set a limit, the check for the hash table size occurs at precisely the
wrong place in the recursion.  It checks the size, then calls the
procedure and when it returns, the result is stored in the hash table.

Let's say we put a limit of 1.  Then we call (memoized-fact 3).
It will determine (3) doesn't exist yet, and that the cache size is 0.
Then it will actually call (original-fact 3), which calls memoized-fact
recursively as (memoized-fact 2).  In the recursion, the cache is still
empty and there's no entry for (2).  So it calls (original-fact 2) to
determine the result to store.  This calls (memoized-fact 1), which
enters with an empty cache and (1) doesn't exist yet.  Then it calls
(original-fact 1), which returns with 1 immediately.

When it returns, the result is stored in the hash table for args = (1).
This returns its value to the pending continuation which will
calculate 2, which is then stored in the cache for (2) and returned to
the pending continuation which will calculate our final result 6, which
is stored in the cache for (3) and then finally returned.

After all this, the cache's size is 3 even though we capped it to a
limit of only 1 item!  Also, the numbers you're storing _are_ really
large (the final result requires over 59KiB to store the number), and
because all the intermediate results are retained because the cache
isn't ever cleared, it's no wonder you're running out of heap size.

The attached patch fixes the problem, as long as you do set a limit,
using (define memo-fact** (memoize fact** 10)).  In fact, you can set
a limit of 1 if you want; the outermost recursion step is the one that
will never be deleted.

A proper test suite should:

a) ensure the procedure is only called as many times as necessary to
   generate new values for uncached values.  Fibonacci would be great
   for measuring this: it should be called n times instead of n^2 times.
b) ensure that the cache limit is obeyed.  This is trickier to write a
   test for.  Maybe something that is called in a sequence that looks
   like 1, 1, 2, 2, 3, ... n, n, ... 3, 3, 2, 2, 1, 1 would be useful:
   in that case it should be called exactly 2n times if the limit is < n,
   unless I'm wrong.  You'd then have to make a few variants of the
   memoized procedure, with limits around the edge cases n, n-1, n+1
   and check if the original procedure is called 2n times or 4n times.

I'm sure smarter people than myself can come up with a better way to
test this.

Cheers,
Peter

Attachment: 0001-Fix-caching-bug-which-caused-the-limit-to-be-ignored.patch
Description: Text Data

Attachment: signature.asc
Description: Digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]