bug-gnu-emacs

bug#51982: Erroneous handling of local variables in byte-compiled nested


From: Mattias Engdegård
Subject: bug#51982: Erroneous handling of local variables in byte-compiled nested lambdas
Date: Wed, 1 Dec 2021 17:04:44 +0100

On 30 Nov 2021, at 23:41, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

> [ We could also force dynamically-scoped code to go through (a neutered
>  version of) cconv.el , so that bytecomp.el and byte-opt.el can presume
>  that `let*` doesn't exist any more.  ]

Yes, a dynbind frontend would be handy for other reasons (some syntactic 
normalisation, in case we can't do it in macroexpand-all). 

> BTW, have you checked the impact on byte-code quality?

With respect to these patches? Yes: the B patch gives slightly better code 
because materialising the accessor (internal-get-closed-var N) is as cheap as, 
or cheaper than, even a stack variable access. But the difference is small and, 
since the case is rare, it's probably insignificant.

In fact, there is probably a way of making them produce identical code by 
constant-propagating such forms in the optimiser. Who knows, might give 
unexpected improvements to existing code as well. Time for an experiment!

>>> These two tests are identical aren't they?
>> No, they exercise different code paths (let and let*).
> 
> Then that deserves a comment ;-)

Will do.
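
To illustrate the let/let* distinction, here is a minimal sketch of two forms
of that shape. These are not the actual tests from the patch; the names
`my-shadow-let` and `my-shadow-let*` are made up for illustration. They run in
the interpreter through `eval` with lexical binding, so they only show the
values that correctly compiled code is expected to produce.

```elisp
;; Hypothetical illustration, not the tests from the patch.
;; A local function F captures X, then X is shadowed at the call site.

(defun my-shadow-let ()
  "Shadow X with `let' around the call to the local function."
  (eval '(let ((x 1))
           (let ((f (lambda () x)))
             (let ((x 2))
               ;; F must still see the outer X.
               (list x (funcall f)))))
        t))

(defun my-shadow-let* ()
  "Shadow X with `let*', calling the local function in a later binding."
  (eval '(let ((x 1))
           (let ((f (lambda () x)))
             (let* ((x 2)
                    (y (funcall f)))
               (list x y))))
        t))
```

Both functions should return (2 1). Exercising the compiler itself would
require byte-compiling the same forms in a file with lexical-binding enabled.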

>>> Looks good (better than patch A).
>> 
>> And here I was prepared to apply patch A since it's slightly more
>> conservative and it seems to be a rare problem anyway.
>> I've now split the patches in a more sensible (and easily reviewed) way: the
>> first corresponds to patch A, and the second is the diff to B. Take a second
>> look before making up your mind.
>> 
>>> You say "On the other hand, patch B does abuse the cconv data structures
>>> a little (but it works!)" so the code should say something about
>>> this abuse.  At least I failed to see where the abuse lies.
>> 
>> There are comments and doc strings such as
>> 
>>  EXTEND is a list of variables which might need to be accessed even
>>  from places where they are shadowed, because some part of ENV causes
>>  them to be used at places where they originally did not
>>  directly appear.
>> 
>> but with the B patch we put things into `extend` that are not strictly
>> variables but (internal-get-closed-var N).
> 
> See below, I think we don't need to put them there.
> 
>> Similarly, `env` has entries like (VAR . (apply-partially F ARG1 ARG2 ..))
>> where the ARGi are always treated as variables but now they can be access
>> forms as well.
> 
> I don't think the current code assumes that ARGs are vars here.
> You're probably right that it used to be the case and it's not any more,
> but that shouldn't cause problems.  The risk I can see is if one of
> those ARGs is an expression which refers to a var which gets shadowed,
> in which case `cconv--remap-llv` won't rewrite it the way it should.
> But I think with your code ARG will either be a simple var or something
> of the form (internal-get-closed-var N) so we should be safe.
> 
>> @@ -304,6 +304,22 @@ cconv--convert-funcbody
>>             `(,@(nreverse special-forms) ,@(macroexp-unprogn body))))
>>       funcbody)))
>> 
>> +(defun cconv--lifted-arg (var env)
>> +  "The argument to use for VAR in λ-lifted calls according to ENV."
>> +  (let ((mapping (cdr (assq var env))))
>> +    (pcase-exhaustive mapping
>> +      (`(internal-get-closed-var . ,_)
>> +       ;; The variable is captured.
>> +       mapping)
>> +      (`(car-safe (internal-get-closed-var . ,_))
>> +       ;; The variable is mutably captured; skip
>> +       ;; the indirection step because the variable is
>> +       ;; passed "by reference" to the λ-lifted function.
>> +       (cadr mapping))
>> +      ((or '() `(car-safe ,(pred symbolp)))
>> +       ;; The variable is not captured; use the (shadowed) variable value.
>> +       var))))
> 
> The docstring or comment at the beginning should mention this function
> is specifically for shadowed vars.

Right.

> Also, if mapping is of the form (car-safe SYMBOL) is `var` really the
> correct answer?  Shouldn't it still be (cadr mapping)?

Can there ever be a difference? I don't think so, but prove me wrong!
(If you manage to do that, you will have found a second bug in the original 
code.)
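
To make the question concrete, here is a stand-alone sketch of the same
dispatch that can be tried outside cconv.el. The name `my-lifted-arg` and the
sample environments are assumptions for illustration only; in particular, the
entry (x . (car-safe x)) assumes that the mapping of a mutably captured
variable names the variable itself.

```elisp
;; Hypothetical stand-alone copy of the dispatch, for experimenting only.
(require 'pcase)

(defun my-lifted-arg (var env)
  "Return the argument form to pass for VAR according to ENV (sketch)."
  (let ((mapping (cdr (assq var env))))
    (pcase-exhaustive mapping
      ;; Captured variable: pass the accessor form itself.
      (`(internal-get-closed-var . ,_) mapping)
      ;; Mutably captured: drop the `car-safe' indirection.
      (`(car-safe (internal-get-closed-var . ,_)) (cadr mapping))
      ;; Not captured: no mapping at all, or (car-safe SYMBOL).
      ((or '() `(car-safe ,(pred symbolp))) var))))

;; Sample environments (assumed shapes):
(my-lifted-arg 'y '((y . (internal-get-closed-var 0))))
(my-lifted-arg 'z '((z . (car-safe (internal-get-closed-var 1)))))
(my-lifted-arg 'x '((x . (car-safe x))))
(my-lifted-arg 'w '())
```

With the entry (x . (car-safe x)), the last branch returns x, and
(cadr '(car-safe x)) is x as well, so the two answers coincide as long as the
symbol inside `car-safe` is the variable's own name.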

For context, this is the case when we have a variable mutated by a 
lambda-lifted inner function (one that doesn't escape). The variable will be 
wrapped in a cons but retain its name. Example:

(lambda (x)
  (let ((f (lambda () (setq x (1+ x)))))
    (let ((x 3))
      (list x (funcall f)))))
->
(lambda (x)
  (let ((x (list x))) 
    (let ((f (lambda (x) (setcar x (1+ (car-safe x))))))
      (let ((x 3)
            (closed-x x))
        (list x (funcall f closed-x))))))
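
As a sanity check, the two forms can be evaluated side by side. This is only
a sketch: the names `my-original-form`, `my-converted-form` and `my-run` are
made up here, and the second form is the hand-written conversion from above
rather than actual cconv output.

```elisp
;; Hypothetical helper names; the forms are copied from the example.
(defconst my-original-form
  '(lambda (x)
     (let ((f (lambda () (setq x (1+ x)))))
       (let ((x 3))
         (list x (funcall f))))))

(defconst my-converted-form
  '(lambda (x)
     (let ((x (list x)))
       (let ((f (lambda (x) (setcar x (1+ (car-safe x))))))
         (let ((x 3)
               (closed-x x))
           (list x (funcall f closed-x)))))))

(defun my-run (form arg)
  "Evaluate the lambda expression FORM lexically and call it with ARG."
  (funcall (eval form t) arg))

;; Both calls should agree.
(list (my-run my-original-form 10)
      (my-run my-converted-form 10))
```

Called with 10, both return (3 11): the inner x is 3, and the lifted function
increments the outer x from 10 to 11. Since the binding is a parallel `let`,
closed-x picks up the outer (boxed) x rather than the new 3.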

> Side note: I don't understand why we `(cons closedsym`, since that
> `closedsym` can never appear in another binding (since it's fresh).

Maybe it's to satisfy the invariant checked by the assertion at the top?

> I don't much like this `symbolp` test (which fundamentally seems to
> be trying to recover the information about which branch of the `pcase`
> we're coming from in `cconv--lifted-arg`).

That's precisely what it is trying to do and no, I don't like it much either.

I suppose cconv--lifted-arg could be made a location function; we could then 
access and mutate local variables. Something poetically self-referential about 
that, but I'm not overly fond of the closure creation overhead (better than 
what it once was but still too high).

>  It at least deserves
> a comment explaining why it's doing the right thing.

> If we can remove this `symbolp` test recovering info about provenance of
> the result of `cconv--lifted-arg` then I think option B is better, but
> otherwise I prefer option A.

I don't see any alternative that is obviously better so I'm applying patch A. 
We can still go with B later on if we want; the changes are minor.

Good comments, thank you very much!






