
From: Stefan Monnier
Subject: Re: [PATCH, RFC] Macros expansion changes, robust symbol macros, and constant propagation
Date: Thu, 19 Sep 2013 18:30:31 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux)

>> I think a good measurement is "load a whole bunch of *.el files", which
>> will spend time in eager macro-expansion (as well as a bit of `read' and
>> a bit of `eval', plus a bit of coding-system decode which could be
>> optimized away, ideally, but isn't yet).  30% slowdown for such
>> a test-case would be definitely considered "too costly".
> Let's see how bad it is in the real world.  I'm confident I can make the
> thing perform as well as the old version in the common case.  Worse comes
> to worst, we can keep the C macroexpand implementation and make
> macroexpand-1 a separate code path.

I don't see any reason why it couldn't be about as fast, indeed.

> Don't forget that constant propagation might provide significant
> offsetting speedups.

I'm not holding my breath.

> I meant "harmless" in the sense that a macroexpand on a symbol-macrolet
> form always produces a fully macroexpanded form, and that this expansion
> incorporates all the expansions of its children. Repeatedly
> macroexpanding this form is safe. What isn't safe is macroexpand-alling
> the result of calling macroexpand-all in a non-null symbol macro
> environment. In this case, yes, you can produce an incorrect result,
> just as the default macro expansion strategy does.

> But you'd have to do that on purpose: a macro is either going to rely on
> the default macro expansion strategy, which works fine, or it's going to
> return something like (cons :dontexpand (macroexpand-all (some-transform
> macro-arguments) (append something macro-environment))), which is also
> safe. You'd have to try in order to make repeated expansion a problem.
> What would be the point of having macroexpand-all return a form with the
> :dontexpand tag still attached when its caller is either going to be
> another macro, which will add its own tag, or a top-level caller, which
> doesn't want to see the tag?
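[ For concreteness, the tagging pattern described above might look
  roughly like this; the :dontexpand tag and `some-transform' are names
  from this thread, not part of any released Emacs, while
  `macroexpand-all-environment' is the real dynamic variable:

```elisp
;; Schematic only: the `:dontexpand' tag and `some-transform' come from
;; this discussion, not from an existing Emacs API.
(defun some-transform (args)
  "Placeholder for whatever rewriting the macro performs."
  (cons 'progn args))

(defmacro my-tagged-macro (&rest args)
  "Fully expand the transformed ARGS, then tag the result so the
expander knows not to walk it again."
  (cons :dontexpand
        (macroexpand-all (some-transform args)
                         macroexpand-all-environment)))
```
]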

I think I'm beginning to understand the reasoning above.

I don't really have an answer to your question.  But the kind of
scenario I'm having trouble with (both because I can't quite imagine it
and because I can't figure out whether things would still work right or
not) is one where a macro ends up returning fully-macroexpanded code
without knowing it.

Or a more concrete one: a macro doesn't know about your shiny new
argument and :dontexpand tag, calls macroexpand-all and returns the
result without adding the :dontexpand tag to it.

If the :dontexpand tag is only here for performance reasons, there's no
problem at all and I don't need to convince myself that it will always
work.  But if we want to rely on it for correctness, the correctness
argument needs to be clarified.

>> I'm not convinced symbol-macro is a good way to perform
>> constant propagation.  See my comment below.
> Why not?

For the algorithmic complexity reasons I mentioned (see also below).

> There's another problem: cconv analysis happens _once_, in
> byte-compile-preprocess.

You can re-run it, of course.  But I agree that it can be problematic to
use (e.g. it imposes strong constraints on the order you traverse the
code afterwards).

> The point of this project is to generate better bytecode by letting
> Emacs make more decisions at compile time. I'm willing to pay for
> runtime speedup with an increase in compilation time.

I think we'd want to beef up (or rewrite) byte-opt.el for that: e.g. add
a `context' to the recursive calls, which carries info about the
variables in scope.  That can be used for constant-propagation as well
as various other optimizations (e.g. replace (car x) with y when x was
defined as (cons y ...)).
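[ A rough sketch of that `context' idea; everything here is
  hypothetical, not actual byte-opt.el code:

```elisp
;; Hypothetical sketch of an optimizer that threads a `context' alist
;; of known bindings through the recursive walk.
(defun my-opt (form context)
  (pcase form
    ;; Record single-binding `let's in the context.
    (`(let ((,var ,val)) ,body)
     (let ((val2 (my-opt val context)))
       `(let ((,var ,val2))
          ,(my-opt body (cons (cons var val2) context)))))
    ;; Replace (car x) with y when x is known to be (cons y ...).
    (`(car ,(and x (pred symbolp)))
     (pcase (cdr (assq x context))
       (`(cons ,y ,_) y)
       (_ form)))
    ;; Constant propagation: substitute known constant values.
    ((pred symbolp)
     (let ((binding (assq form context)))
       (if (and binding (macroexp-const-p (cdr binding)))
           (cdr binding)
         form)))
    (_ form)))
```
]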

>> That is true.  But there are fairly good performance reasons for it.
> Sure --- but we're talking about running constant propagation only
> during byte compilation.  cconv has to run all the time.

No, cconv is only run during byte-compilation.

>>> We'll be using it for other tasks too.  Also, any unsafe code
>>> transformation is a bug.
>> Yes, but I'm not convinced it's really unsafe.  It is sometimes less
>> aggressive than it could be, but that's not a problem.
> It depends on the application. The function's definition doesn't say
> anything about it being a conservative analysis.

The naive analysis it performs is naturally conservative, except for the
fact that it might miss a variable that only shows up after
macro-expansion (possible, in theory, but I haven't seen it in
practice).  So any bug can be fixed by simply changing that function to
call macroexpand-all.
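[ That fix would amount to something like the following;
  `macroexp-analyze-free-variables' is the function from the patch
  under discussion, not a released Emacs function:

```elisp
;; Sketch of the suggested fix: expand macros before analyzing, so a
;; variable reference hidden behind a macro cannot be missed.
(defun my-safe-free-variables (form &optional env)
  (macroexp-analyze-free-variables (macroexpand-all form env)))
```
]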

So, yes, if the analysis is here to stay, there's no reason not to use
it, but this use doesn't justify such a large piece of code.

>>>>> - changes cl-defsubst so that it works like regular defsubst
>>>> Doesn't this break setf on defstruct slots?
>>> Yes, but I fixed it by restoring the old explicit-accessor code.  Having
>>> gv depend on compiler macros is scary.
>> But it's efficient.  A single symbol property provides both features at
>> the same time.
> Symbol properties aren't particularly expensive.

`cl-defstruct' expands to a crapload of code.  Less is better here.

> "Common Lisp does X, so we should too" is a good argument, and you can't
> dismiss it by just saying that Emacs is not Common Lisp,

I did not *just* say that.  I argued earlier why this ship has already
sailed, pointing out that in practice this Common Lisp principle is not
followed in Emacs Lisp.

>> Yuck!  The above is already pretty bad, and adding
>> macroexp-analyze-free-variables will make it even worse: optimizing
>> a let according to the above code takes time O(size(exp)).  Plus another
>> O(size(exp)) for macroexp-analyze-free-variables (actually, this one can
>> be even O(size(exp)^2) or so because at each step it can manipulate
>> data-structures that can grow with size(exp)).  And this for every
>> nesting level, so you get a serious O(N^2) behavior overall, sliding
>> towards O(N^3), just to optimize away some const bindings.
> Yes, we need to walk each form to figure out which bindings we can
> safely eliminate, and yes, we need to walk the code again to do the
> replacements. We need to do that for each let form. But your analysis is
> too conservative: the data structure manipulation is going to be cheap
> in practice because most forms aren't made chiefly out of free variable
> references. It might also be possible to memoize the results so that
> total running time becomes linear.

"It might be possible", yes.  Until you do that, this code can't be
integrated, because O(N^2) or worse behaviors like that *will* bite us
sooner or later, even if we measured no slowdown in out tests.

There is no need for such algorithmic complexity just to perform
constant propagation.

> Time isn't a problem in practice.

Oh yes, it is.  Compiling Emacs takes a godawful amount of time for me.

> Even the naively-implemented,
> un-optimized version of macroexp-analyze-free-variables  processes the
> macro-expanded version of cconv-analyze-form (a very large function) in
> 1.6ms.

Sure.  But multiply that with calling this function once per let
(i.e. repeatedly for the same piece of nested code), and the figure can
become a lot more significant.

> The point of compilation is to spend some CPU time now so that you spend
> less CPU time later. Why does compilation need to be fast?

Why does it have to be O(N^2) when it can be O(N)?

>> The "immutable" info is already computed by cconv, so you could instead
>> change cconv to output code where the immutable let bindings
>> are annotated.
> And rerun cconv after each optimization pass?

[ You mean cconv's analysis, not the whole cconv, obviously.  ] If you
want to apply optimizations repeatedly (which is indeed a good idea),
then let's do it right: have an O(N) analysis pass over the whole code,
then an O(N) optimization pass over the whole code, and then loop over
them.  As opposed to your code which performs analysis of
sub-expressions in the middle of an optimization pass (which is the
problem that leads to the O(N^2) behavior).
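[ In code, the shape I mean is something like this; the analyze and
  rewrite functions are placeholders, not real Emacs functions:

```elisp
;; Sketch: one O(N) analysis pass over the whole code, then one O(N)
;; rewrite pass using its result, iterated to a fixpoint.
(defun my-analyze (form) (ignore form) nil)       ; O(N) walk, placeholder
(defun my-rewrite (form info) (ignore info) form) ; O(N) walk, placeholder

(defun my-optimize-to-fixpoint (form)
  (let ((changed t))
    (while changed
      (let* ((info (my-analyze form))
             (new  (my-rewrite form info)))
        (setq changed (not (equal new form))
              form new)))
    form))
```
]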

>>>>> (push (list temp) cl--loop-bindings)
>>>>> (setq form `(if (setq ,temp ,cond)
>>>>> -                                ,@(cl-subst temp 'it form))))
>>>>> +                                (symbol-macrolet-shadowable ((it ,temp))
>>>>> +                                  ,@form))))
>>>> Why not (setq form `(let ((it ,cond)) (when it ,@form)))?
>>> In dynamically bound code, we want the `it' symbol to be lexically
>>> scoped.
>> I don't think anyone really cares about that.
> I don't understand why you're opposed to rewriting `it' here.  It's not
> as if the `let' solution is much smaller.

For one, I'm not 100% sure that symbol-macrolet-shadowable provides the
exactly right semantics and is implemented 100% correctly.  OTOH, I have
complete trust in `let'.

Also because we want to use `let' here *in any case*.  It's a good idea
(which we should install right away) even if we don't end up performing
constant propagation.
BTW, what happens if `form' does a `setq' on `it'?

> I don't think it's either simpler or cleaner.

The current cl-symbol-macrolet is neither simple nor clean.  Your code might
make it more robust and clean up some of its semantics.  It's still far
from simple, and we have little experience with it to know if
it's robust.

And of course, if `form' is large, this will take time O(N) again, which
again brings us an O(N^2) behavior, whereas if we use `let' and rely on
a subsequent phase to do the constant propagation we don't need to
suffer from such a cost.

>> Also it will be more efficient for the lexical-binding code (as
>> discussed recently, in lexical-binding, `let' is cheaper than `setq').
> Who said anything about setq?

The current code uses `setq'.

>> That's OK, since I still don't want symbol-macro in the core.
> I still don't understand why you're opposed to adding symbol-macrolet.

Because it's very rarely used so far.

> Yes, it's actually (function (lambda ...)). So what? Why not handle both
> cases?

Why handle a case that *can't* happen?

>>>>> +(defconst macroexpand-already-expanded
>>>>> +  (if (boundp 'macroexpand-already-expanded)
>>>>> +      (symbol-value 'macroexpand-already-expanded)
>>>>> +    (make-symbol "--macroexpand-already-expanded--"))
>>>> Again: why not a defvar?  Also, we could define it as an interned
>>>> well-known symbol.  E.g. macroexp--already-expanded.
>>> In this case, I think an interned well-known symbol will work.  I'd
>>> prefer making it a keyword symbol, though, in order to highlight its
>>> special effects.
>> I'd rather stay away from using keywords as valid functions.
> It needs to be a valid function in order for the interpreter to work.

Exactly.  So just make it a normal symbol.  Just with the added
semantics that "you should only call it with a completely macro-expanded
form".
> If you _save_ that value and then apply it to macroexpand later, it
> should work.  macroexpand-all-environment's dynamic scope is independent
> of the extent of that variable's value.

Right.  So we'd have to also save macroexpand-hook-functions.

>> What other uses (besides symbol-macrolet) do you have for that
>> macroexpand-environment-hook-tag?
> The labels code looks scary. I wonder whether it might benefit as well.

Are you talking about the labels or the cl-labels code?

Anyway, I suggest you try and split your patch into chunks.  E.g. "fix
symbol-macrolet not to rely on symbol's names being non-eq"; "introduce
macroexpand-1"; "move macroexpand to Lisp" [ These are things that
I would agree with right away. ], and then we can think again about
the rest.

