emacs-devel

Re: elisp-benchmarks


From: Mattias Engdegård
Subject: Re: elisp-benchmarks
Date: Thu, 10 Feb 2022 13:12:18 +0100

9 feb. 2022 kl. 23.19 skrev Stefan Monnier <monnier@iro.umontreal.ca>:

> And we see that Matthias's recent improvements to the bytecode
> interpreter do make a quite significant difference on several of those
> microbenchmarks ;-), and also on the bytecompiler benchmark (offsetting
> the extra work needed for the symbol-with-positions) tho they don't make
> much of a difference when it comes to scrolling(with-jit-lock) or when
> it comes to reindenting code with SMIE :-)

You are much too kind; my own macro-benchmarks indicate that we haven't quite 
reached status quo ante yet.

However, I would like to caution against using the 'total' line of these 
benchmarks, and in fact suggest that it be removed entirely since it can be very 
misleading: summing the timings amounts to a completely arbitrary weighting of 
the individual benchmarks, each one counting in proportion to how long it runs.

If we want an aggregate number that weighs all benchmark components equally, 
one way is to first establish a baseline, express each result relative to that 
baseline, and take the geometric mean of those ratios.

But that only makes sense if we value each component equally, and there is no 
reason to do that: many of the benchmarks measure essentially the same thing, 
and the mixture cannot in any way be defended as representative of any kind of 
practical Emacs use.
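
To make the difference between the two aggregates concrete, here is a minimal Python sketch. The benchmark names and timings are invented for illustration and are not taken from the run discussed here; `total_ratio` and `geometric_mean_ratio` are hypothetical helper names.

```python
import math

# Hypothetical timings in seconds (assumed values, not measured ones).
baseline = {"long-bench": 100.0, "short-bench": 1.0}
changed = {"long-bench": 100.0, "short-bench": 0.5}


def total_ratio(base, new):
    """Ratio of summed timings: each benchmark is weighted by its run time."""
    return sum(new.values()) / sum(base.values())


def geometric_mean_ratio(base, new):
    """Geometric mean of per-benchmark ratios: each benchmark counts equally."""
    ratios = [new[name] / base[name] for name in base]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


print(round(total_ratio(baseline, changed), 3))           # 0.995
print(round(geometric_mean_ratio(baseline, changed), 3))  # 0.707
```

Halving the short benchmark barely moves the total because the long one dominates the sum, whereas the geometric mean treats both ratios alike. Neither number says whether equal weighting is a sensible choice in the first place.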

Individual benchmarks can of course be of interest: a proper presentation would 
be in a table where they each can be compared across different changes. 
Rearranging your data and normalising to before sympos, separately for timings 
that exclude and include GC, gives:

|                    |      ex gc      |     inc gc      |
| test               | sympos | master | sympos | master |
|--------------------+--------+--------+--------+--------|
| bubble             |   0.95 |   0.74 |   1.00 |   0.91 |
| bubble-no-cons     |   1.01 |   0.83 |   1.01 |   0.83 |
| bytecomp           |   1.06 |   0.99 |   1.06 |   1.02 |
| dhrystone          |   1.03 |   0.80 |   1.03 |   0.80 |
| eieio              |   0.99 |   0.83 |   0.99 |   0.90 |
| fibn               |   0.99 |   0.69 |   0.99 |   0.69 |
| fibn-named-let     |   0.98 |   0.68 |   0.98 |   0.68 |
| fibn-rec           |   1.01 |   0.66 |   1.01 |   0.66 |
| fibn-tc            |   1.02 |   0.60 |   1.02 |   0.60 |
| flet               |   1.06 |   0.68 |   1.06 |   0.68 |
| inclist            |   1.07 |   0.98 |   1.07 |   0.98 |
| inclist-type-hints |   1.07 |   0.98 |   1.07 |   0.98 |
| listlen-tc         |   1.00 |   0.73 |   1.00 |   0.73 |
| map-closure        |   1.02 |   0.69 |   1.02 |   0.69 |
| nbody              |   0.97 |   0.85 |   0.99 |   0.97 |
| pack-unpack        |   0.99 |   0.80 |   0.99 |   0.90 |
| pack-unpack-old    |   1.01 |   0.91 |   1.01 |   0.94 |
| pcase              |   0.87 |   0.86 |   0.87 |   0.86 |
| pidigits           |   0.93 |   0.86 |   0.96 |   0.92 |
| scroll             |   0.97 |   1.04 |   0.97 |   1.04 |
| smie               |   1.02 |   1.01 |   1.01 |   1.01 |

which is a bit more informative (at least if you know what the benchmarks do, 
but otherwise it's all nonsense numbers anyway).
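
For anyone wanting to produce such a table from raw timings, the normalisation is just a per-benchmark division by the baseline. A rough Python sketch, with made-up build names, benchmark names and timings (`normalise` and `format_rows` are hypothetical helpers, not part of elisp-benchmarks):

```python
# Hypothetical raw timings in seconds, keyed by build, then by benchmark.
timings = {
    "before": {"bench-a": 2.00, "bench-b": 4.00},
    "sympos": {"bench-a": 2.10, "bench-b": 3.96},
    "master": {"bench-a": 1.50, "bench-b": 4.08},
}


def normalise(timings, baseline="before"):
    """Divide every timing by the same benchmark's timing in the baseline build."""
    base = timings[baseline]
    return {
        build: {name: t / base[name] for name, t in runs.items()}
        for build, runs in timings.items()
        if build != baseline
    }


def format_rows(relative, builds):
    """Render one table row per benchmark, one column per build."""
    names = sorted(relative[builds[0]])
    return [
        "| %-8s | %s |"
        % (name, " | ".join("%6.2f" % relative[b][name] for b in builds))
        for name in names
    ]


relative = normalise(timings)
for row in format_rows(relative, ["sympos", "master"]):
    print(row)
```

Values below 1.00 mean faster than the baseline. Timings excluding and including GC would each go through the same step separately.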

For the record, my own Relint benchmark (we all have our pets!) is at about 
1.03 from the same baseline, which is notable because it exercises a wide 
variety of operations, many of which weren't affected by any of the changes.

> The >10% slowdown recently seen on the test suite is still a mystery
> waiting for someone to figure out what's going on.

A config option to switch off sympos would be handy.

> BTW, I think one thing is clear when I look at those benchmarks:
> Emacs's GC is not good enough.

Indeed it's the elephant in the room. In addition, designing meaningful 
benchmarks that aren't GC-dominated can be tricky.



