[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Multiple values passed as single argument to procedure

From: Mark H Weaver
Subject: Re: Multiple values passed as single argument to procedure
Date: Mon, 12 Jun 2017 22:26:12 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)

David Kastrup <address@hidden> writes:

> Mark H Weaver <address@hidden> writes:
>> I'm sorry David, but _everything_ that you wrote below is incorrect.
> Well, let me try again.  It's not all that easy to understand.

You're obviously quite intelligent and knowledgeable, so it's probably
my failure to explain it well more than anything else.  We can try again

>> David Kastrup <address@hidden> writes:
>>> Chris Marusich <address@hidden> writes:
>>>> I think I'm missing something here.  In (list (f)), the call to f
>>>> certainly looks like it's happening at a position that one might
>>>> intuitively call a "tail" position.
>>> It is,
>> No it isn't.

The reason that (f) is not in tail position with respect to (list (f))
is because there is still something left to do after calling (f) but
before invoking the continuation of (list (f)).  The thing remaining to
do is to apply 'list' to the returned value.

Putting it another way, if (f) evaluates to 5, you cannot proceed by
reducing (list (f)) to 5.  The continuation of (f) cannot be the same as
the continuation of (list (f)), because that would skip the step of
calling 'list'.

* * *

It might be helpful to be more precise about what continuations are.  To
a first approximation, a continuation can be visualized as an expression
with a hole.  For example, in the expression (list (f) (f) (f)), suppose
that the compiler chooses to evaluate the second argument first.  Then
the continuation of the middle (f) is:

  (list (f) <> (f))

where <> is the hole.  When 'f' returns its value (i.e. invokes its
continuation), what remains to be done is this: evaluate the expression
which is formed by putting the return value into the hole.  So, if that
middle (f) returns 5, then we proceed to evaluate:

  (list (f) 5 (f))

If the compiler chooses to evaluate the first argument next, then the
continuation for the first (f) will be:

  (list <> 5 (f))

and so on.  To make this a bit more precise, we could actually use
normal Scheme procedures to model these continuations.  The continuation
above could be written like this:

  (lambda (x)
    (list x 5 (f)))

Supposing that (f) returns 6 this time (maybe it's a counter?), the
continuation of the last (f) will be:

  (lambda (z)
    (list 6 5 z))

In fact, in Scheme we can actually write code in such a way that makes
these continuations explicit.  Here's how we do it: we add an extra
argument to every procedure: its continuation.  Instead of returning
values from procedures, we explicitly call the continuation, passing the
return value as its argument.

If you do this until every procedure call is in tail position, the
resulting program is in "continuation passing style" or CPS.

Here's what (lambda () (list (f 1) (f 2) (f 3))) looks like in CPS,
using the same evaluation order as I chose above:

  (lambda (outer-k)
    (f^ 2 (lambda (y)
            (f^ 1 (lambda (x)
                    (f^ 3 (lambda (z)
                            (list^ x y z outer-k))))))))

Where 'f^' and 'list^' are variants of 'f' and 'list' that accept an
explicit continuation argument.

Continuation passing style makes more things explicit.  It forces us to
explicitly choose an evaluation order, and it entails assigning names to
every intermediate value.  The new variables x, y, and z will hold the
intermediate values returned by (f 1), (f 2), and (f 3), respectively.

So far, I've shown only unary continuations, but when a program is in
continuation passing form, it becomes trivial to "return" an arbitrary
number of values by simply adding more arguments to the continuations.

For example, we could modify the CPS example above to receive two values
from 'f', and to return a list of all six results:

  (lambda (outer-k)
    (f^ 2 (lambda (y1 y2)
            (f^ 1 (lambda (x1 x2)
                    (f^ 3 (lambda (z1 z2)
                            (list^ x1 x2 y1 y2 z1 z2 outer-k))))))))

In CPS, this is a very straightforward, and requires no additional
syntax or primitives.  However, in "direct style" (i.e. the normal way
we write programs) we cannot add multiple return values so easily.  The
difficulty is essentially syntactic in nature.

Here's one way to do the same job as above in direct style, and making
the order of argument evaluation unspecified:

  (lambda ()
    (let-values (((x1 x2) (f 1))
                 ((y1 y2) (f 2))
                 ((z1 z2) (f 3)))
      (list x1 x2 y1 y2 z1 z2)))

compared with the earlier version that keeps only one argument from each
call to 'f'.

  (lambda () (list (f 1) (f 2) (f 3)))

Notice how similar the two CPS examples are, while the corresponding
programs in direct style are strikingly different.

>>> but list does not take multiple values
>> Yes it does.  'list' accepts an arbitrary number of values
>> (arguments).
> In my book, multiple values/arguments are different things

As you can see above, in CPS, they are *exactly* the same thing.  And in
fact Guile 2.2 transforms programs into CPS as an integral part of
compilation before code generation.

> but I admit that this is a quagmire to me and strictly speaking what I
> call "multiple values" (namely the unmitigated return value from
> calling values)

Your statement above, where you call "multiple values" "the unmitigated
return value" (singular) reveals your confusion.  I suspect you are
still thinking of multiple value returns as if the values are being
packaged up into a single object which is returned and then taken apart
by the caller.  This is a mistake.  Nothing like this is happening
inside Guile's VM.

I suspect that you do not have any trouble envisioning multiple
arguments being passed *to* a procedure without packaging them into a
single object.  You surely know that in C, the arguments are stored in
separate registers, or placed in separate entries on the stack, before
jumping to a C function.  Something similar can be done to return
multiple values, if the calling convention allows it.  Guile's does.

If you free your mind from the biases of direct style and C calling
conventions, there's a remarkable symmetry between passing arguments
*to* a procedure and returning values *from* a procedure.  The apparent
lack of symmetry is purely a syntactic artifact of the "direct style"
that we generally prefer to work with.

>>> and thus discards additional values returned by f.
>> 'list' does not discard anything.  The additional values are discarded
>> before 'list' is called.
> Ok.  Because a function call cannot take multiple values.  It does take
> multiple _arguments_.

Let's take another look at Chris's example:

  (define (f . _) (values 1 2))
  (list (f)) => (1)

If you expected (1 2), then consider this:

  (list (f) (f) (f)) => (1 1 1)

Would you expect (1 2 1 2 1 2)?  If so, I guess you are looking for
splicing behavior in argument lists.  To implement this, we'd need to
convert things to CPS differently.  Instead of:

  (lambda (outer-k)
    (f^ 2 (lambda (y)
            (f^ 1 (lambda (x)
                    (f^ 3 (lambda (z)
                            (list^ x y z outer-k))))))))

You could do something like this:

  (lambda (outer-k)
    (f^ 2 (lambda ys
            (f^ 1 (lambda xs
                    (f^ 3 (lambda zs
                            (append^ xs ys zs
                                     (lambda (args)
                                       (apply^ list^ args outer-k))))))))))

I know that there are some languages that do this kind of splicing, but
in my opinion it's a very bad idea.  In Scheme, when you see the
procedure call:

  (f (x) (y))

You know just from this, without any context, that (x) will be the first
argument and (y) will be the second argument.  This fact is important
for both humans and compilers.  For humans it is important because we
can more easily and reliably reason about the code we are looking at.

For compilers it is important for the same reason, actually.  For
example, if 'f' is known at compile time, and something useful is known
about (y), we can inline the call to 'f', substituting (y) for its
second argument, and use our knowledge of (y) to optimize the result.

However, if we had splicing semantics for procedure calls, and if 'x' is
not known at compile time (it usually isn't), then we would have _no_
idea which argument of 'f' (y) will be bound to, and we can't do much of
anything to optimize this.

So, I suppose you could consider it design choice to insist that in an
arbitrary procedure call

  (f (x) (y))

that we know statically (i.e. at compile time) that no matter how (x)
and (y) behave, two arguments will always be passed in this call, and
the first argument will be (x) and the second argument will be (y).

I should mention that if you consider examples where the procedure being
called is 'list' or '+' or something else which accepts an arbitrary
number of arguments and treats them all interchangably, then the
splicing behavior seems more sensible.  However, such procedures are
quite rare in practice.

The overwhelmingly common case is that a procedure's arguments have very
different roles, where splicing is not at all useful and merely makes it
difficult for humans (and usually impossible for the compiler) to know
which argument a given operand will be assigned to.

Does that make sense?


reply via email to

[Prev in Thread] Current Thread [Next in Thread]