Re: [O] speeding up Babel Gnuplot

From: Thierry Banel
Subject: Re: [O] speeding up Babel Gnuplot
Date: Wed, 04 Jan 2017 00:06:40 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.0

On 03/01/2017 22:55, Nicolas Goaziou wrote:
> Hello,
> Thierry Banel <address@hidden> writes:
>> Here is a patch to avoid generating temporary files multiple times.
>> There is no way to ensure a single call to
>> (org-babel-gnuplot-process-vars) without modifying ob-core.el. I don't
>> want to do that because I would have to change a lot of babel backends.
>> Thus, I come back to my first light patch.
>> A 'param' list is passed around. It reflects the #+BEGIN_SRC header. My
>> patch changes it in-place from:
>>   (((:var data (3000) (2999) (2998) (2997) ...
>> to:
>>   (((:var data . "/tmp/babel-16991kSr/gnuplot-16991YBq") ...
>> The 'param' list behaves as a cache. There is nothing wrong with that.
>> The worst thing that can happen is the caching no longer working in case
>> 'param' would be copied some day. Results would stay correct.
> Thank you.
> What is the benefit of this patch? I mean,
> `org-babel-gnuplot-process-vars' is already quite fast here. Do you have
> some benchmark for that?
The benefit is that Babel Gnuplot runs about twice as fast on large Org
tables (thousands of rows). On small tables there is no real benefit.
Without the patch, two temporary files are left over in /tmp. They have
identical content: data suitably formatted for Gnuplot. Creating such
large temp files outweighs any other Babel processing.

Granted, you have already sped up `org-babel-gnuplot-process-vars'
quite a lot by reworking `org-export-table-row-number'. Still, going down
from 4 seconds to 2 seconds on a 5000-row table (on my computer) is
quite pleasant.

My patch is very light: `org-babel-gnuplot-process-vars' is the only
modified function, and the change amounts to adding a (setcdr) call
that caches the result of an expensive computation.
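
To illustrate the idea outside of Babel, here is a minimal sketch of the
in-place caching. It is not the patch itself: all the `sketch-' names are
hypothetical, and `make-temp-file' stands in for `org-babel-temp-file'.
The first call on a pair writes the file and overwrites the cdr of the
pair with the file name; a second call on the same pair sees a string
instead of a list and returns it without writing anything.

```elisp
;; Hypothetical names throughout; this only sketches the caching idea.
(defvar sketch-expensive-calls 0
  "Number of times the costly file-writing step has run.")

(defun sketch-write-data-file (rows)
  "Stand-in for the costly step: write ROWS to a temp file, return its name."
  (setq sketch-expensive-calls (1+ sketch-expensive-calls))
  (let ((file (make-temp-file "gnuplot-sketch-")))
    (with-temp-file file
      (dolist (row rows)
        ;; One table row per line, cells separated by tabs.
        (insert (mapconcat (lambda (cell) (format "%s" cell)) row "\t")
                "\n")))
    file))

(defun sketch-process-var (pair)
  "Return the value of PAIR, a (NAME . VALUE) cons, usable by Gnuplot.
A list VALUE is written to a file once; PAIR is then modified in place."
  (let ((val (cdr pair)))
    (if (not (listp val))
        val                        ; a scalar, or a file name cached earlier
      (let ((file (sketch-write-data-file val)))
        (setcdr pair file)         ; cache: later calls see the file name
        file))))
```

Calling (sketch-process-var pair) twice on the same cons should write
only one file. If the pair were copied between the two calls, the second
call would simply write the file again, which is slower but still correct.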

>>      (car pair) ;; variable name
>> -    (let* ((val (cdr pair)) ;; variable value
>> -           (lp  (listp val)))
>> -      (if lp
>> +    (let ((val (cdr pair))) ;; variable value
>> +      (if (not (listp val))
>> +          val
>> +        (let ((temp-file (org-babel-temp-file "gnuplot-"))
>> +              (first  (car val)))
>> +          (setcdr pair temp-file) ;; <------ caching here
> It would be nice to expunge the comment a bit.

Yes, sure. If the patch is accepted, I'll clean it up.

> Another option would be to generate a file according to the hash of
> contents so `org-babel-gnuplot-process-vars' knows when to create a new
> file.
Your proposal provides an additional benefit: caching file generation
across several invocations of Babel. (The cache in my patch is intended
to be used within a single Babel invocation, and is then garbage
collected.) The drawback is that we would need to go through all rows of
the table to compute the hash, just to discover that the hash was already
known. The purpose of the cache was precisely to avoid going through the
table again.
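
For comparison, a hash-based variant could look roughly like the sketch
below. This is only my reading of the proposal, not an agreed design:
the function name, the file naming scheme, and the choice of
`secure-hash' over the printed table are all assumptions. It shows where
the cost remains: the hash is computed over the whole table on every
call, even when the file already exists.

```elisp
;; Hypothetical sketch of a content-addressed data file.
(defun sketch-data-file-for (rows)
  "Return a file holding ROWS, reusing an existing file with the same content."
  (let* ((hash (secure-hash 'sha1 (prin1-to-string rows))) ; visits every cell
         (file (expand-file-name (concat "gnuplot-sketch-" hash)
                                 temporary-file-directory)))
    ;; Only the write is skipped on a cache hit; the hashing above is not.
    (unless (file-exists-p file)
      (with-temp-file file
        (dolist (row rows)
          (insert (mapconcat (lambda (cell) (format "%s" cell)) row "\t")
                  "\n"))))
    file))
```

Two calls with equal tables return the same file name, including across
separate Babel invocations, as long as the file is still present in the
temporary directory.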

Your proposal involves substantial work. We may also want to extend it
to all other Babel backends (R, shell, C, etc.). I may help if enough
users need it.

