[ANN] faster org-table-to-lisp

From: tbanelwebmin
Subject: [ANN] faster org-table-to-lisp
Date: Thu, 30 Apr 2020 08:34:32 +0200
Hi The List.

Here is an alternative, faster version of org-table-to-lisp. It can be
more than 100 times faster.

#+BEGIN_SRC elisp
(defun org-table-to-lisp-faster (&optional org-table-at-p-done)
  "Convert the table at point to a Lisp structure.
The structure will be a list.  Each item is either the symbol `hline'
for a horizontal separator line, or a list of field values as strings.
The table is taken from the buffer at point.
When the optional ORG-TABLE-AT-P-DONE parameter is not nil, it is
assumed that (org-at-table-p) was already called."
  (or org-table-at-p-done (org-at-table-p) (user-error "No table at point"))
  (save-excursion
    (goto-char (org-table-begin))
    (let ((end (org-table-end))
          (row)
          (table))
      (while (< (point) end)
        (setq row nil)
        ;; Skip to just after the leading "|" of the current line.
        (search-forward "|" end)
        (if (looking-at "-")
            ;; Separator line: skip it and record `hline'.
            (progn
              (search-forward "\n" end)
              (push 'hline table))
          ;; Data line: collect one field per iteration until end of line.
          (while (not (search-forward-regexp "\\=\n" end t))
            (unless (search-forward-regexp "\\=\\s-*\\([^|]*\\)" end t)
              (user-error "Malformed table at char %s" (point)))
            (let ((b (match-beginning 1))
                  (e (match-end       1)))
              ;; Move back over trailing blanks so they are not kept.
              (and (search-backward-regexp "[^ \t]" b t)
                   (forward-char 1))
              (push
               (buffer-substring-no-properties b (point))
               row)
              ;; Jump past the "|" that closes this field.
              (goto-char (1+ e))))
          (push (nreverse row) table)))
      (nreverse table))))
#+END_SRC

Below is an example of a large table borrowed from the Datamash
software. On my PC, the reproducible benchmark shows:
- Traditional org-table-to-lisp: 130 seconds
- Alternative org-table-to-lisp: 0.8 seconds (not compiled)
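
For anyone who wants a quick measurement without downloading the bench
file, here is a rough sketch. It is not the bench used for the numbers
above: the helper name my/bench-table-to-lisp and the synthetic
three-column table are assumptions made only for illustration, and
timings will vary with the table and the machine.

#+BEGIN_SRC elisp
(require 'org)
(require 'org-table)
(require 'benchmark)

;; Hypothetical helper, not part of Org.
(defun my/bench-table-to-lisp (fn rows)
  "Return the seconds FN takes to convert a generated table of ROWS rows.
FN is called without argument, with point at the start of the table."
  (with-temp-buffer
    (org-mode)
    ;; Build a synthetic table; the real bench uses a Datamash table.
    (dotimes (i rows)
      (insert (format "| %d | foo | bar |\n" i)))
    (goto-char (point-min))
    ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED).
    (car (benchmark-run 1 (funcall fn)))))
#+END_SRC

Evaluating (my/bench-table-to-lisp #'org-table-to-lisp 1000) and then
the same form with #'org-table-to-lisp-faster gives two elapsed times
to compare.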

It is faster because it operates directly on the buffer with
(search-forward-regexp), whereas the standard function splits a string
extracted from the buffer.
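
To make the contrast concrete, here is a simplified sketch of both
approaches applied to a single row. Neither function is the actual Org
code: the names my/row-by-splitting and my/row-by-scanning and the
regexps are assumptions chosen for brevity.

#+BEGIN_SRC elisp
;; Hypothetical illustration: string-based parsing of one row.
(defun my/row-by-splitting (line)
  "Parse LINE, a table row held in a string, by splitting it on \"|\"."
  ;; The leading and trailing \"|\" yield an empty first and last piece,
  ;; which are dropped here.
  (butlast (cdr (split-string line "[ \t]*|[ \t]*"))))

;; Hypothetical illustration: buffer-based parsing of one row.
(defun my/row-by-scanning ()
  "Parse the table row at point by moving through the buffer."
  (let (row)
    (save-excursion
      (beginning-of-line)
      (search-forward "|" (line-end-position))
      ;; Each match starts exactly at point (\\=) and consumes one field
      ;; together with the \"|\" that closes it.
      (while (re-search-forward "\\=[ \t]*\\([^|\n]*?\\)[ \t]*|"
                                (line-end-position) t)
        (push (match-string-no-properties 1) row)))
    (nreverse row)))
#+END_SRC

The first version needs the row as a string, so a caller working on a
buffer must first copy the text out of it. The second one only
allocates the strings of the fields themselves.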

This function is a drop-in replacement for the standard one. Babel and
Gnuplot can benefit from it.

Would it make sense to update the Org Mode code base with it?

Beware! The optional parameter has a different meaning in the two
functions:
- for the traditional function, it is a string representing an Org table
- for the alternative function, it is a Boolean telling whether
(org-at-table-p) has already been called

This difference does not matter for the use cases in the code base.
The function is always called without a parameter, or as:

#+BEGIN_SRC elisp
#+END_SRC

Here is the reproducible bench. It is a self-contained Org Mode file,
to be opened in Emacs.
wget http://tbanelwebmin.free.fr/OrgMode/bench-org-table-to-lisp.org.gz

