emacs-orgmode

Re: [Orgmode] [PATCH] sha1 hash of latex fragments to avoid regeneration


From: Carsten Dominik
Subject: Re: [Orgmode] [PATCH] sha1 hash of latex fragments to avoid regeneration
Date: Tue, 17 Nov 2009 14:14:15 +0100

Hi Eric,

Looks great now. I have made a few minor changes and applied it.

- Carsten

On Nov 17, 2009, at 1:11 AM, Eric Schulte wrote:

From: "Eric Schulte" <address@hidden>
To: Carsten Dominik <address@hidden>
Cc: Org Mode <address@hidden>
Subject: Re: [Orgmode] [PATCH] sha1 hash of latex fragments to avoid regeneration
Date: Mon, 16 Nov 2009 17:11:03 -0700
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1.50 (darwin)
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="

--=-=-=

Hi Carsten,

Thanks for the feedback. I have comments inline below.

Carsten Dominik <address@hidden> writes:

> Hi Eric,
>
> this is fantastic, thank you for implementing it. I have wanted some
> speedup for this for a long time.
>
> I think your implementation still suffers from one issue:
>
> The produced image also depends on the variables
> org-format-latex-options, org-format-latex-header,
> org-export-latex-package-alist, and on the `forbuffer' flag (because
> images made for display in the buffer and for HTML export generally
> need different resolutions).
>
> One way to deal with this would be to make a list containing the
> values of these four variables, use prin1-to-string to convert this
> list into a string, and then prepend this string to TXT when creating
> the hash.
>

That sounds like the best solution.  I have made this change in the
newly attached patch.
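
For illustration, here is a minimal sketch of the idea outside of
org-format-latex.  The helper name `my/latex-fragment-hash' is
hypothetical and not part of Org; it only shows how the settings, the
`forbuffer' flag and the fragment text can be folded into one sha1 key,
so that changing any of them yields a different file name.

```elisp
;; Sketch only: `my/latex-fragment-hash' is a hypothetical helper,
;; not an Org function.
(defun my/latex-fragment-hash (txt forbuffer &rest settings)
  "Return a SHA-1 hex digest for TXT, FORBUFFER and SETTINGS.
SETTINGS holds the values of any variables that affect the image."
  ;; Print the whole list to a string so that every value takes part
  ;; in the digest, then hash that string.
  (sha1 (prin1-to-string (append settings (list forbuffer txt)))))

;; The file name is then derived from the key ("ltxpng/doc" is an
;; assumed prefix).
(format "%s_%s.png" "ltxpng/doc"
        (my/latex-fragment-hash "$x^2$" t "\\documentclass{article}"))
```

The same fragment hashed with `forbuffer' set to nil gives a different
key, so buffer previews and HTML export images do not collide.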

>
> Now, I am sure that you are already planning to do the same
> for ditaa images etc?

of course :)

> That would be a treat, because ditaa can be terribly slow for complex
> figures, and this would speed up the cycle when writing a document by
> quite a bit.
>

Dan and I have been working on a general caching solution for org-babel.
Once we get that sorted out it should provide for the caching of all
org-babel results, which would include ditaa, dot, gnuplot, etc.

I am currently more interested in making these changes in org-babel than
in org-exp-blocks, but in this case it may be worth implementing caching
in both.

>
> There is one further issue:  Cleaning up images that are no
> longer used.
>
> With the LaTeX fragments it is not a big problem, because they
> live in a special directory.  This would be a bigger concern for
> ditaa images etc which tend to live in the same directory as the
> source.  Maybe that could be solved by
>
> 1. Making sure that each image still has a name like "blue", so
>    that the name now would be "blue_loooooonghashvalue.png" or so.
> 2. Maybe creating a command that will look for orphaned images
>    and remove them, by looking for the hash in the name and
>    checking access times.  I am not sure if this is needed,
>    and not sure what would be the best way to implement it.
>

Yes, this will not be an issue in the org-babel implementation as the
hash key is stored separately from the file name, but I can see how this
would need to be considered for any org-exp-blocks hash-based image
caching.  The first option you propose above sounds very doable, as long
as we are comfortable removing any files that match regular expressions
like

  blue_[[:alnum:]]+\.png

which seems safe enough...
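
A rough sketch of such a cleanup follows.  The function name
`my/delete-orphaned-images' and its arguments are hypothetical, nothing
like it exists in Org yet, and it assumes the caller can supply the list
of hashes still in use.  It also tightens the pattern to exactly 40 hex
digits, the length of a sha1 digest, which is an assumption beyond the
regular expression above.

```elisp
;; Sketch only: `my/delete-orphaned-images' is a hypothetical helper,
;; not an existing Org command.
(defun my/delete-orphaned-images (dir prefix live-hashes)
  "Delete PREFIX_<hash>.png files in DIR whose hash is not in LIVE-HASHES.
Return the list of deleted file names."
  (let ((re (concat "\\`" (regexp-quote prefix)
                    "_\\([[:xdigit:]]\\{40\\}\\)\\.png\\'"))
        deleted)
    ;; `directory-files' matches RE against the non-directory part.
    (dolist (file (directory-files dir t re))
      (let ((name (file-name-nondirectory file)))
        ;; Re-match to capture the hash portion of the file name.
        (when (and (string-match re name)
                   (not (member (match-string 1 name) live-hashes)))
          (delete-file file)
          (push file deleted))))
    (nreverse deleted)))
```

Requiring a full-length digest keeps a hand-named file such as
"blue_notes.png" from being removed by accident.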

>
> After looking at these things, I would be *very* happy to accept
> this patch.
>

I'll give this some more thought, but perhaps the latex image fragment
patch is now viable.

Best -- Eric


--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline;
 filename=0001-latex-fragment-images-cached-using-sha1-hash-keys.patch

From 0e9a359c1d5e8f67c20066533171fb1edc11ba61 Mon Sep 17 00:00:00 2001
From: Eric Schulte <address@hidden>
Date: Mon, 16 Nov 2009 16:53:34 -0700
Subject: [PATCH] latex fragment images cached using sha1 hash keys

  Latex fragment images are now saved in files named by the sha1 hash
  of the latex text used to create the image.  By checking if files
  exist before images generation the regeneration of identical latex
  images is avoided.
---
 lisp/ChangeLog |    6 ++++++
 lisp/org.el    |   22 +++++++++++-----------
 2 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/lisp/ChangeLog b/lisp/ChangeLog
index 5f83aaa..b581931 100755
--- a/lisp/ChangeLog
+++ b/lisp/ChangeLog
@@ -1,3 +1,9 @@
+2009-11-17  Eric Schulte  <address@hidden>
+
+       * org.el (org-format-latex): Latex images are now saved to files
+       named by the sha1 hash of the latex source text avoiding
+       regeneration of identical images.
+
 2009-11-16  Carsten Dominik  <address@hidden>

        * org-html.el (org-export-html-home/up-format): Add an ID to the
diff --git a/lisp/org.el b/lisp/org.el
index bf6573b..15a8f9e 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -14550,15 +14550,9 @@ Some of the options can be changed using the variable
         (opt org-format-latex-options)
         (matchers (plist-get opt :matchers))
         (re-list org-latex-regexps)
-        (cnt 0) txt link beg end re e checkdir
+        (cnt 0) txt hash link beg end re e checkdir
         executables-checked
         m n block linkfile movefile ov)
-    ;; Check if there are old images files with this prefix, and remove them
-    (when (file-directory-p todir)
-      (mapc 'delete-file
-           (directory-files
-            todir 'full
-            (concat (regexp-quote prefixnodir) "_[0-9]+\\.png$"))))
     ;; Check the different regular expressions
     (while (setq e (pop re-list))
       (setq m (car e) re (nth 1 e) n (nth 2 e)
@@ -14576,9 +14570,14 @@ Some of the options can be changed using the variable
            (setq txt (match-string n)
                  beg (match-beginning n) end (match-end n)
                  cnt (1+ cnt)
-                 linkfile (format "%s_%04d.png" prefix cnt)
-                 movefile (format "%s_%04d.png" absprefix cnt)
                  link (concat block "[[file:" linkfile "]]" block))
+            (setq hash (sha1 (prin1-to-string
+                              (list org-format-latex-header
+                                    (if (boundp 'org-export-latex-package-alist)
+                                        org-export-latex-package-alist)
+                                    forbuffer txt)))
+                 linkfile (format "%s_%s.png" prefix hash)
+                 movefile (format "%s_%s.png" absprefix hash))
            (if msg (message msg cnt))
            (goto-char beg)
            (unless checkdir ; make sure the directory exists
@@ -14592,8 +14591,9 @@ Some of the options can be changed using the variable
               "dvipng" "needed to convert LaTeX fragments to images")
              (setq executables-checked t))

-           (org-create-formula-image
-            txt movefile opt forbuffer)
+            (unless (file-exists-p movefile)
+              (org-create-formula-image
+               txt movefile opt forbuffer))
            (if overlays
                (progn
                  (mapc (lambda (o)
--
1.6.4.73.gc144


--=-=-=


>
> - Carsten
>
> On Nov 16, 2009, at 1:07 AM, Eric Schulte wrote:
>
>> Hi,
>>
>> The attached patch changes the latex fragment image generation so that
>> it saves images into files named by the sha1 hash of the latex source
>> code.  By checking for the existence of image files before image
>> generation the regeneration of identical images is avoided.
>>
>> In practice I find that this greatly speeds up export to html and the
>> `org-preview-latex-fragment' command.
>>
>> Cheers -- Eric
>>
>> From 13e1c48fa6cac43b0c87ca0fbc8e349f7a9fa864 Mon Sep 17 00:00:00 2001
>> From: Eric Schulte <address@hidden>
>> Date: Sun, 15 Nov 2009 17:00:09 -0700
>> Subject: [PATCH] latex fragment images cached using sha1 hash keys
>>
>> Latex fragment images are now saved in files named by the sha1 hash
>>  of the latex text used to create the image.  By checking if files
>>  exist before images generation the regeneration of identical latex
>>  images is avoided.
>> ---
>> lisp/ChangeLog |    6 ++++++
>> lisp/org.el    |   18 +++++++-----------
>> 2 files changed, 13 insertions(+), 11 deletions(-)
>>
>> diff --git a/lisp/ChangeLog b/lisp/ChangeLog
>> index 339f248..f18755c 100755
>> --- a/lisp/ChangeLog
>> +++ b/lisp/ChangeLog
>> @@ -1,3 +1,9 @@
>> +2009-11-16  Eric Schulte  <address@hidden>
>> +
>> +      * org.el (org-format-latex): Latex images are now saved to files
>> +      named by the sha1 hash of the latex source text avoiding
>> +      regeneration of identical images.
>> +
>> 2009-11-15  Carsten Dominik  <address@hidden>
>>
>>        * org-wl.el (org-wl-store-link): Handle the case that
>> diff --git a/lisp/org.el b/lisp/org.el
>> index bf6573b..46348fc 100644
>> --- a/lisp/org.el
>> +++ b/lisp/org.el
>> @@ -14550,15 +14550,9 @@ Some of the options can be changed using the variable
>>         (opt org-format-latex-options)
>>         (matchers (plist-get opt :matchers))
>>         (re-list org-latex-regexps)
>> -       (cnt 0) txt link beg end re e checkdir
>> +       (cnt 0) txt hash link beg end re e checkdir
>>         executables-checked
>>         m n block linkfile movefile ov)
>> -    ;; Check if there are old images files with this prefix, and remove them
>> -    (when (file-directory-p todir)
>> -      (mapc 'delete-file
>> -          (directory-files
>> -           todir 'full
>> -           (concat (regexp-quote prefixnodir) "_[0-9]+\\.png$"))))
>>     ;; Check the different regular expressions
>>     (while (setq e (pop re-list))
>>       (setq m (car e) re (nth 1 e) n (nth 2 e)
>> @@ -14576,9 +14570,10 @@ Some of the options can be changed using the variable
>>            (setq txt (match-string n)
>>                  beg (match-beginning n) end (match-end n)
>>                  cnt (1+ cnt)
>> -                linkfile (format "%s_%04d.png" prefix cnt)
>> -                movefile (format "%s_%04d.png" absprefix cnt)
>>                  link (concat block "[[file:" linkfile "]]" block))
>> +            (setq hash (sha1 txt)
>> +                linkfile (format "%s_%s.png" prefix hash)
>> +                movefile (format "%s_%s.png" absprefix hash))
>>            (if msg (message msg cnt))
>>            (goto-char beg)
>>            (unless checkdir ; make sure the directory exists
>> @@ -14592,8 +14587,9 @@ Some of the options can be changed using the variable
>>               "dvipng" "needed to convert LaTeX fragments to images")
>>              (setq executables-checked t))
>>
>> -          (org-create-formula-image
>> -           txt movefile opt forbuffer)
>> +            (unless (file-exists-p movefile)
>> +              (org-create-formula-image
>> +               txt movefile opt forbuffer))
>>            (if overlays
>>                (progn
>>                  (mapc (lambda (o)
>> --
>> 1.6.4.73.gc144
>>
>> _______________________________________________
>> Emacs-orgmode mailing list
>> Remember: use `Reply All' to send replies to the list.
>> address@hidden
>> http://lists.gnu.org/mailman/listinfo/emacs-orgmode
>
> - Carsten

--=-=-=--

- Carsten