bug#51982: Erroneous handling of local variables in byte-compiled nested
From: Mattias Engdegård
Subject: bug#51982: Erroneous handling of local variables in byte-compiled nested lambdas
Date: Wed, 1 Dec 2021 17:04:44 +0100
30 nov. 2021 kl. 23.41 skrev Stefan Monnier <monnier@iro.umontreal.ca>:
> [ We could also force dynamically-scoped code to go through (a neutered
> version of) cconv.el, so that bytecomp.el and byte-opt.el can presume
> that `let*` doesn't exist any more. ]
Yes, a dynbind frontend would be handy for other reasons (some syntactic
normalisation in case we can't do it in macroexpand-all).
> BTW, have you checked the impact on byte-code quality?
With respect to these patches? Yes: the B patch gives slightly better code
because materialising the accessor (internal-get-closed-var N) is as cheap or
cheaper than even a stack variable access. But the difference is small and
since the case is rare it's probably insignificant.
In fact, there is probably a way of making them produce identical code by
constant-propagating such forms in the optimiser. Who knows, might give
unexpected improvements to existing code as well. Time for an experiment!
>>> These two tests are identical aren't they?
>> No, they exercise different code paths (let and let*).
>
> Then that deserves a comment ;-)
Will do.
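
A minimal sketch of why `let` and `let*` are separate cases when a variable is shadowed; it is not one of the tests from the patch, and the `my-demo-` names are hypothetical. With `let` the init forms still see the outer binding, while with `let*` they see the new one.

```elisp
;; Hypothetical illustration, not taken from the patch or its tests.
;; Both forms are evaluated with lexical binding (second arg to `eval' is t).

(defun my-demo-let-shadow ()
  "Parallel binding: the init form for Y sees the outer X (1)."
  (eval '(let ((x 1))
           (let ((x 2)
                 (y x))
             (list x y)))
        t))

(defun my-demo-let*-shadow ()
  "Sequential binding: the init form for Y sees the inner X (2)."
  (eval '(let ((x 1))
           (let* ((x 2)
                  (y x))
             (list x y)))
        t))

(my-demo-let-shadow)    ; => (2 1)
(my-demo-let*-shadow)   ; => (2 2)
```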
>>> Looks good (better than patch A).
>>
>> And here I was prepared to apply patch A since it's slightly more
>> conservative and it seems to be a rare problem anyway.
>> I've now split the patches in a more sensible (and easily reviewed) way: the
>> first corresponds to patch A, and the second is the diff to B. Take a second
>> look before making up your mind.
>>
>>> You say "On the other hand, patch B does abuse the cconv data structures
>>> a little (but it works!)" so the code should say something about
>>> this abuse. At least I failed to see where the abuse lies.
>>
>> There are comments and doc strings such as
>>
>> EXTEND is a list of variables which might need to be accessed even
>> from places where they are shadowed, because some part of ENV causes
>> them to be used at places where they originally did not
>> directly appear.
>>
>> but with the B patch we put things into `extend` that are not strictly
>> variables but (internal-get-closed-var N).
>
> See below, I think we don't need to put them there.
>
>> Similarly, `env` has entries like (VAR . (apply-partially F ARG1 ARG2 ..))
>> where the ARGi are always treated as variables but now they can be access
>> forms as well.
>
> I don't think the current code assumes that ARGs are vars here.
> You're probably right that it used to be the case and it's not any more,
> but that shouldn't cause problems. The risk I can see is if one of
> those ARGs is an expression which refers to a var which gets shadowed,
> in which case `cconv--remap-llv` won't rewrite it the way it should.
> But I think with your code ARG will either be a simple var or something
> of the form (internal-get-closed-var N) so we should be safe.
>
>> @@ -304,6 +304,22 @@ cconv--convert-funcbody
>> `(,@(nreverse special-forms) ,@(macroexp-unprogn body))))
>> funcbody)))
>>
>> +(defun cconv--lifted-arg (var env)
>> +  "The argument to use for VAR in λ-lifted calls according to ENV."
>> +  (let ((mapping (cdr (assq var env))))
>> +    (pcase-exhaustive mapping
>> +      (`(internal-get-closed-var . ,_)
>> +       ;; The variable is captured.
>> +       mapping)
>> +      (`(car-safe (internal-get-closed-var . ,_))
>> +       ;; The variable is mutably captured; skip
>> +       ;; the indirection step because the variable is
>> +       ;; passed "by reference" to the λ-lifted function.
>> +       (cadr mapping))
>> +      ((or '() `(car-safe ,(pred symbolp)))
>> +       ;; The variable is not captured; use the (shadowed) variable value.
>> +       var))))
>
> The docstring or comment at the beginning should mention this function
> is specifically for shadowed vars.
Right.
> Also, if mapping is of the form (car-safe SYMBOL) is `var` really the
> correct answer? Shouldn't it still be (cadr mapping)?
Can there ever be a difference? I don't think so, but prove me wrong!
(If you manage to do that, you will have found a second bug in the original
code.)
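
To make the question concrete, here is a sketch that copies the quoted function under a hypothetical name (`my-lifted-arg`) and feeds it hand-written sample environments. The sample `env` alists are assumptions about what the mappings look like, based only on the patterns in the patch; this illustrates the cases and proves nothing about cconv in general.

```elisp
;; Standalone copy of the function from the quoted diff, renamed so it
;; does not clash with cconv.el.  The name and the sample environments
;; in the calls further down are hypothetical.
(defun my-lifted-arg (var env)
  "Return the argument to pass for VAR according to the alist ENV."
  (let ((mapping (cdr (assq var env))))
    (pcase-exhaustive mapping
      ;; Captured: pass the access form itself.
      (`(internal-get-closed-var . ,_)
       mapping)
      ;; Mutably captured: drop the `car-safe' wrapper.
      (`(car-safe (internal-get-closed-var . ,_))
       (cadr mapping))
      ;; No mapping, or a cons-wrapped variable: pass the variable.
      ((or '() `(car-safe ,(pred symbolp)))
       var))))

;; Captured variable.
(my-lifted-arg 'x '((x . (internal-get-closed-var 0))))
;; => (internal-get-closed-var 0)

;; Mutably captured variable.
(my-lifted-arg 'x '((x . (car-safe (internal-get-closed-var 1)))))
;; => (internal-get-closed-var 1)

;; Cons-wrapped variable that kept its name: here `var' and
;; (cadr mapping) are the same symbol.
(my-lifted-arg 'x '((x . (car-safe x))))
;; => x

;; No mapping at all.
(my-lifted-arg 'x '())
;; => x
```

In the third call the two candidate answers coincide because the wrapped variable keeps its own name; they could only differ if `env` mapped a variable to `(car-safe OTHER-SYMBOL)`.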
For context, this is the case when we have a variable mutated by a
lambda-lifted inner function (that doesn't escape). The variable will be
wrapped in a cons but retain its name. Example:
(lambda (x)
  (let ((f (lambda () (setq x (1+ x)))))
    (let ((x 3))
      (list x (funcall f)))))
->
(lambda (x)
  (let ((x (list x)))
    (let ((f (lambda (x) (setcar x (1+ (car-safe x))))))
      (let ((x 3)
            (closed-x x))
        (list x (funcall f closed-x))))))
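
A rough way to try the source example, assuming a session where `byte-compile` is available; the `my-demo-` names are hypothetical. It runs the untransformed form interpreted and byte-compiled so the two results can be compared by eye. Whether the compiled result agrees will depend on the Emacs version, so no expected value is given for it.

```elisp
;; Hypothetical helper names; only the lambda form comes from the message.
(defconst my-demo-form
  '(lambda (x)
     (let ((f (lambda () (setq x (1+ x)))))
       (let ((x 3))
         (list x (funcall f)))))
  "The source example from the message, kept as data.")

(defun my-demo-interpreted (arg)
  "Call the example interpreted, with lexical binding."
  (funcall (eval my-demo-form t) arg))

(defun my-demo-compiled (arg)
  "Call the example byte-compiled, with lexical binding."
  (let ((lexical-binding t))
    (funcall (byte-compile my-demo-form) arg)))

;; The inner X is 3; F increments the outer X from 1 to 2 and returns it.
(my-demo-interpreted 1)   ; => (3 2)

;; Compare with the interpreted result above.
(my-demo-compiled 1)
```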
> Side note: I don't understand why we `(cons closedsym`, since that
> `closedsym` can never appear in another binding (since it's fresh).
Maybe it's to satisfy the invariant checked by the assertion at the top?
> I don't much like this `symbolp` test (which fundamentally seems to
> be trying to recover the information about which branch of the `pcase`
> we're coming from in `cconv--lifted-arg`).
That's precisely what it is trying to do and no, I don't like it much either.
I suppose cconv--lifted-arg could be made a local function; we could then
access and mutate local variables. Something poetically self-referential about
that, but I'm not overly fond of the closure creation overhead (better than
what it once was but still too high).
> It at least deserves
> a comment explaining why it's doing the right thing.
> If we can remove this `symbolp` test recovering info about provenance of
> the result of `cconv--lifted-arg` then I think option B is better, but
> otherwise I prefer option A.
I don't see any alternative that is obviously better so I'm applying patch A.
We can still go with B later on if we want; the changes are minor.
Good comments, thank you very much!