[O] [PATCH] Re: Problems with org publish cache checking

From: Matt Lundin
Subject: [O] [PATCH] Re: Problems with org publish cache checking
Date: Wed, 25 Nov 2015 20:30:54 -0600
User-agent: Gnus/5.130014 (Ma Gnus v0.14) Emacs/25.1.50 (gnu/linux)

Matt Lundin <address@hidden> writes:

> I've been doing some testing of org-publish functions and have found a
> few problems with org-publish-cache-file-needs-publishing. They arise
> from the fact that it attempts to take included files into account.

OK, I've worked up a patch that solves several of these issues. The
basic idea is to check when publishing an org file whether it includes
other org files and then to store that data in the cache. That way,
org-publish-cache-file-needs-publishing does not need to open each
buffer but rather can compare the stored timestamp data against the
actual modified times of the included files.
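
To make the idea concrete, here is a rough sketch of the comparison (not
part of the patch; the `my/' helpers are made up for illustration, and the
cached data is assumed to be a plain alist of (FILE . CACHED-MTIME) pairs):

```emacs-lisp
;; Illustrative sketch only; the `my/' names are hypothetical and do
;; not exist in ox-publish.el.
(require 'cl-lib)

(defun my/file-mtime (file)
  "Return FILE's modification time as a float, or 0.0 if FILE is missing."
  (let ((attrs (file-attributes file)))
    (if attrs
        ;; Element 5 of the attribute list is the modification time.
        (float-time (nth 5 attrs))
      0.0)))

(defun my/includes-changed-p (includes)
  "Return non-nil if any file in INCLUDES changed since it was cached.
INCLUDES is an alist of (FILE . CACHED-MTIME) pairs recorded at
the last publish."
  (cl-some (lambda (incl)
             ;; Stale when the cached time is older than the file on disk.
             (< (cdr incl) (my/file-mtime (car incl))))
           includes))
```

The point is that the check only stats the included files; no buffer has to
be opened or searched.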

> Org-publish does not check the cache of included files at all. It
> simply compares the last modified time of an included file with the
> last modified time of the master/including file. The result is that a
> master file will perpetually be republished if an included file
> happened to be changed afterwards (even if both files were changed
> years ago and the project has been published 100s of times since
> then).

This patch fixes this by caching timestamps for included files, thus
allowing org-publish to track changes in included files.
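
A small worked example of the difference, using made-up timestamps and
hypothetical `my/' functions that only mimic the two comparison rules
(they are not the actual ox-publish.el code):

```emacs-lisp
;; Illustrative sketch only; timestamps are plain numbers.
(require 'cl-lib)

(defun my/old-needs-publishing-p (pstamp master-mtime include-mtimes)
  "Old rule: compare each include's mtime against the master's mtime."
  (or (null pstamp)
      (< pstamp master-mtime)
      (cl-some (lambda (ct) (< master-mtime ct)) include-mtimes)))

(defun my/new-needs-publishing-p (pstamp master-mtime includes)
  "New rule: INCLUDES is a list of (CACHED-MTIME . CURRENT-MTIME) pairs."
  (or (null pstamp)
      (< pstamp master-mtime)
      (cl-some (lambda (pair) (< (car pair) (cdr pair))) includes)))

;; Master edited at t=100, include edited at t=200, published at t=300.
(my/old-needs-publishing-p 300 100 '(200))          ; non-nil on every run
(my/new-needs-publishing-p 300 100 '((200 . 200)))  ; nil: include unchanged
;; Include edited again at t=350, after its mtime was cached.
(my/new-needs-publishing-p 300 100 '((200 . 350)))  ; non-nil
```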

> 3. It is slow!!! The function visits every file in a project to check
>    for #+INCLUDE declarations, thus offsetting much of the benefit of
>    caching timestamps. To test this, I created a dummy project with over
>    1000 pages (not typical usage, of course, but possible for someone
>    writing a blog over several years or creating a large interlinked
>    wiki).

This patch should make things much faster, since we only need to scan
for included files during publishing (when the buffer is already
active). Org-publish no longer has to visit each file individually
to decide whether it needs publishing (which takes a lot of time);
rather, it can just consult the cache.


From 7a69052334416309802c861a7b6b72865c331a37 Mon Sep 17 00:00:00 2001
From: Matt Lundin <address@hidden>
Date: Wed, 25 Nov 2015 20:23:39 -0600
Subject: [PATCH] Speed up publishing by caching included file data

* lisp/ox-publish.el (org-publish-cache-get-included-files): New function.
  (org-publish-org-to): Use new function.
  (org-publish-cache-file-needs-publishing): Use cache instead of
  visiting every file in a project.

Org-publish can now quickly determine a) whether an org source includes
other files and b) whether those files have changed. This speeds up the
publishing process and makes tracking of changes in included files more
 lisp/ox-publish.el | 62 ++++++++++++++++++++++++++++++------------------------
 1 file changed, 34 insertions(+), 28 deletions(-)

diff --git a/lisp/ox-publish.el b/lisp/ox-publish.el
index 90f307c..ba85c7e 100644
--- a/lisp/ox-publish.el
+++ b/lisp/ox-publish.el
@@ -574,6 +574,7 @@ Return output file name."
             (let ((output-file
                    (org-export-output-file-name extension nil pub-dir))
                   (body-p (plist-get plist :body-only)))
+              (when org-publish-cache (org-publish-cache-get-included-files))
               (org-export-to-file backend output-file
                 nil nil nil body-p
                 ;; Add `org-publish--collect-references' and
@@ -1227,36 +1228,41 @@ the file including them will be republished as well."
   (unless org-publish-cache
     (error
      "`org-publish-cache-file-needs-publishing' called, but no cache present"))
-  (let* ((case-fold-search t)
-        (key (org-publish-timestamp-filename filename pub-dir pub-func))
+  (let* ((key (org-publish-timestamp-filename filename pub-dir pub-func))
         (pstamp (org-publish-cache-get key))
-        (org-inhibit-startup t)
-        (visiting (find-buffer-visiting filename))
-        included-files-ctime buf)
-    (when (equal (file-name-extension filename) "org")
-      (setq buf (find-file (expand-file-name filename)))
-      (with-current-buffer buf
-       (goto-char (point-min))
-       (while (re-search-forward "^[ \t]*#\\+INCLUDE:" nil t)
-         (let* ((element (org-element-at-point))
-                (included-file
-                 (and (eq (org-element-type element) 'keyword)
-                      (let ((value (org-element-property :value element)))
-                        (and value
-                             (string-match "^\\(\".+?\"\\|\\S-+\\)" value)
-                             ;; Ignore search suffix.
-                             (car (split-string
-                                   (org-remove-double-quotes
-                                    (match-string 1 value)))))))))
-           (when included-file
-             (push (org-publish-cache-ctime-of-src
-                    (expand-file-name included-file))
-                   included-files-ctime)))))
-      (unless visiting (kill-buffer buf)))
+        (ctime (when pstamp (org-publish-cache-ctime-of-src filename))))
     (or (null pstamp)
-       (let ((ctime (org-publish-cache-ctime-of-src filename)))
-         (or (< pstamp ctime)
-             (cl-some (lambda (ct) (< ctime ct)) included-files-ctime))))))
+       (< pstamp ctime)
+       (cl-some (lambda (incl)
+                  ;; See if cached time is before modification time.
+                  (< (cdr incl)
+                     (org-publish-cache-ctime-of-src (car incl))))
+         (org-publish-cache-get-file-property filename :includes)))))
+
+(defun org-publish-cache-get-included-files ()
+  "Get names and last modified times of included files in current buffer."
+  (let ((case-fold-search t)
+       included)
+    (save-excursion
+      (goto-char (point-min))
+      (while (re-search-forward "^[ \t]*#\\+INCLUDE:" nil t)
+       (let* ((element (org-element-at-point))
+              (included-file
+               (and (eq (org-element-type element) 'keyword)
+                    (let ((value (org-element-property :value element)))
+                      (and value
+                           (string-match "^\\(\".+?\"\\|\\S-+\\)" value)
+                           ;; Ignore search suffix.
+                           (car (split-string
+                                 (org-remove-double-quotes
+                                  (match-string 1 value)))))))))
+         (when included-file
+           (let ((iname (expand-file-name included-file)))
+             (push (cons iname (org-publish-cache-ctime-of-src
+                                (expand-file-name iname)))
+                   included))))))
+    (org-publish-cache-set-file-property (buffer-file-name)
+                                        :includes included)))
 (defun org-publish-cache-set-file-property
   (filename property value &optional project-name)
