emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Compiling Elisp to a native code with a GCC plugin


From: Stephen J. Turnbull
Subject: Re: Compiling Elisp to a native code with a GCC plugin
Date: Sat, 18 Sep 2010 02:40:20 +0900

Lars Magne Ingebrigtsen writes:

 > I think that's a rather ... stunning approach, but might explain
 > many things about XEmacs.

I rather suspect it does. ;-)

 > I've presented a use case, and I've demonstrated how all the
 > alternative implementations are 50-150% slower than my suggested
 > new implementation, and I've done the implementation, which turned
 > out to be totally trivial.

Nothing that deals with Emacs strings or buffers is totally trivial,
as a general principle.

Then, again, it looks like David has discovered at least one bug
(texts with different values of multibyteness), maybe more (bounds
checking and integral type confusion), in your "totally trivial"
implementation already.

 > What more do you need?

First, I'm curious which machine and what data (buffer) you're using
that took 9 seconds to run that benchmark.  When I ran the benchmark
(environment described below) using XEmacs on a 1.8MHz Opteron machine
(quad core, but XEmacs can't take advantage of it) I discovered that
10,000 iterations takes ~300ms uncompiled, ~200ms compiled, and ~150ms
compiled and with the call to `search-forward' in question replaced
with a call to `ignore'.  I don't see a win here unless you're really
calling that function 10,000 times or more, and in a very tight loop.

So, is it really Gnus's habit to execute that form 10,000 times in a
loop so that its execution time dominates the user's perceived lag
time?  I bet that most uses involve parsing 20-40 RFC 822-style
headers, and the rest parse lines lazily.  If so, even the reported 9
second benchmark really amounts to a total of 50-100ms, which is less
than the "just noticable difference" for a fast human.

***** You can stop reading unless you really want to know the details. *****

Specifically, profiling 10000 iterations in a 997-byte buffer
containing three instances of "^(defun " (none at BOB) returned almost
faster than I can detect (288ms), based on

(defun is-the-string-following-point-equal-to-this-string-p (s)
  (search-forward s (+ (point) (length s)) t))

(defun test ()
  (while (or (is-the-string-following-point-equal-to-this-string-p "(defun ")
             (search-forward "\n(defun " nil t))
    (forward-line 1)))

(profile-expression '(let ((i 10000))
                        (while (> i 0)
                          (goto-char (point-min))
                          (test)
                          (setq i (1- i)))))

giving the profiling results below (note, all functions defined above
are *uncompiled*).  (Note that the total in the Ticks column may not
be the sum of the Ticks; this apparently has to do with reentering the
profiler and Ben didn't think it was worth fixing it.)

Function Name      Ticks/Total %Usage Calls GC-Usage/  Total
========================/===== ====== ===== ========/=======
search-forward       121/  121 42.160 80000                 
(profile overhead)   101/  101 35.192                       
is-the-string-following-point-equal-to-this-string-p
                      24/  129  8.362 40000                 
while                 11/  288  3.833 10001                 
or                     8/  231  2.787 40000                 
+                      6/    6  2.091 40000                 
forward-line           5/    5  1.742 30000                 
setq                   5/    6  1.742 10000                 
point                  2/    2  0.697 40000                 
test                   2/  262  0.697 10000                 
length                 1/    1  0.348 40000                 
>                      1/    1  0.348 10001                 
let                    0/  288  0.000     1                 
point-min              0/    0  0.000 10000                 
1-                     0/    0  0.000 10000                 
goto-char              0/    0  0.000 10000                 
------------------------------------------------------------
Total                288      100.000 380005        0


Ticks/Total     = Ticks this function/this function and descendants
Calls           = Number of calls to this function
GC-Usage/Total  = Lisp allocation this function/this function and descendants
One tick        = 1 ms

Compiled (including the profiling expression, that's `test1'), the
result is

Function Name      Ticks/Total %Usage Calls GC-Usage/  Total
========================/===== ====== ===== ========/=======
search-forward       112/  112 59.574 80000                 
(profile overhead)    42/   42 22.340                       
is-the-string-following-point-equal-to-this-string-p
                      16/   85  8.511 40000                 
test                  13/  181  6.915 10000                 
test1                  5/  190  2.660     1                 
------------------------------------------------------------
Total                190      100.000 130003        0


Ticks/Total     = Ticks this function/this function and descendants
Calls           = Number of calls to this function
GC-Usage/Total  = Lisp allocation this function/this function and descendants
One tick        = 1 ms

And here's the result with the search-forward in the predicate
replaced with ignore (which apparently conses because of the &rest
argument, and I'm not sure why the number is so huge):

Function Name      Ticks/Total %Usage Calls GC-Usage/  Total
========================/===== ====== ===== ========/=======
search-forward        74/   74 44.578 40000                 
(profile overhead)    32/   32 19.277                       
test                  24/  158 14.458 10000        0/3840000
ignore                19/   22 11.446 40000  3840000/3840000
is-the-string-following-point-equal-to-this-string-p
                      11/   46  6.627 40000        0/3840000
test1                  6/  166  3.614     1        0/3840000
------------------------------------------------------------
Total                  0      100.000 130003  3840000


Ticks/Total     = Ticks this function/this function and descendants
Calls           = Number of calls to this function
GC-Usage/Total  = Lisp allocation this function/this function and descendants
One tick        = 1 ms



reply via email to

[Prev in Thread] Current Thread [Next in Thread]