[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] add SRFI: srfi-121; generators

From: Shiro Kawai
Subject: Re: [PATCH] add SRFI: srfi-121; generators
Date: Fri, 22 Jan 2021 16:14:44 -1000

Hi Mark,
I agree that the whole point of generators is performance.  When combined with lazy sequences (srfi-127), it covers many typical use cases of streams much more efficiently.
I've written Gauche version with performance in mind, but it depends on a few non-standard features and can not be used as-is.  Besides, if we do go for performance, we need benchmark suites as well.  I'm interested in adopting Gauche version as the reference implementation but don't know when I have time.  If you have some code to improve the current reference implementation and can merge it to the repo, we can proceed gradually, I guess.


On Fri, Jan 22, 2021 at 2:59 PM Mark H Weaver <> wrote:
Hi John,

John Cowan <> writes:

> Mark: I'm interested to know if you have a sketch of ideas for a more
> efficient implementation of SRFI 121/158.  You say it requires specific
> knowledge of Guile internals, but are you willing to sketch how you might
> do it?  Thanks.

Here are a few suggestions off the top of my head, looking only at the
latest SRFI-121 reference implementation:

* In 'gcombine', 'generator-fold', 'generator-for-each', and possibly
  also 'generator-unfold', it would be helpful to use 'case-lambda' to
  provide specialized code for cases where the 'dotted' tail in the
  argument list consists of 1 or 2 arguments.  When a procedure with a
  dotted tail is called, it forces allocation of a new list, and last I
  checked Guile does not include optimizations to avoid that allocation
  where possible.  Worse, the general case requires calling 'map' and
  allocating a new list every time a new item is requested.  It's a
  great help to avoid these expenses in the most common cases.  For
  example, see the definitions of 'map', 'for-each', and 'fold' in

* Avoid using 'define-values' in internal lexical contexts in Guile for
  now.  Hopefully some day Guile's implementation of 'define-values'
  will be more efficient, but for now it's implemented as a simple macro
  that expands into code that mutates variables, which prevents several
  other optimizations that could otherwise be done by Guile's compiler.
  In particular, in 'gcombine', better use 'call-with-values' or

* Several procedures are defined in terms of more general higher-order
  procedures, or create intermediate lists/generators unnecessarily, for
  the sake of making the code simpler.  In most contexts I would applaud
  this practice, but there will be a significant price to pay in
  efficiency.  For example, 'generator->reverse-list' with 1 argument is
  implemented in terms of 'generator-fold', and with 2 arguments by
  creating an intermediate generator using 'gtake'.

* To make matters worse: 'gtake', as well as 'make-for-each-generator'
  and 'make-unfold-generator', are defined in terms of coroutine
  generators.  It's good to have coroutine generators available, but
  they are quite expensive, and best avoided where efficiency is

* 'generator->vector' is defined using 'generator->list' and
  'list->vector'.  That's bad enough, but digging deeper we see that
  'generator->list' is implemented by reversing the result of
  'generator->reverse-list', which as I've already pointed out is
  defined using 'generator-fold', which currently involves calling 'map'
  and 'append' and allocating temporary lists for each item generated.

In general, it's helpful to go through the mental exercise of tracing an
evaluation of each of these procedures, and more importantly to trace
calls to the produced generators, to get some idea of the expense

This is just a sampling of the problems I noticed from a quick skim of
the code, by no means comprehensive.

I acknowledge that following the suggestions above will make the code
much larger, uglier, and more difficult to understand.  I would not
recommend such practices except in cases where efficiency is paramount.

In this case, I think efficiency *is* paramount.  My rationale is that,
like hash tables, generators inevitably force code that uses them to be
written in an imperative style, and therefore they are best avoided in
my opinion, *except* in cases where efficiency is paramount.

To avoid being forced to write code in an imperative style, I suggest
that in most cases streams are preferable to generators.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]