bug#26338: 26.0.50; Collect all matches for REGEXP in current buffer

From: Tino Calancha
Subject: bug#26338: 26.0.50; Collect all matches for REGEXP in current buffer
Date: Fri, 07 Apr 2017 23:47:16 +0900
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.0.50 (gnu/linux)

Juri Linkov <address@hidden> writes:

>>> Sorry if this was said already, but why a macro and not a map-like
>>> function?
>> No special reason.  It's the second idea which came to my mind after
>> my initial proposal was declined.  Maybe because it is shorter to do:
>> (with-collect-matches regexp)
>> than
>> (foo-collect-matches regexp nil #'identity)
>> if you are just interested in the list of matches.  Implementing it as
>> a map function might be also nice.  Don't see a big enthusiasm on
>> the proposal, though :-(
>> So far people think that it's easy to write a while loop.  I wonder if they
>> think the same about the existence of `dolist': that they should
>> never use it and always write a `while' loop instead.  Don't think they
>> do that anyway.
>> I will repeat it once more.  I find it nice to have an operator returning
>> a list with matches for REGEXP.  If such an operator, in addition, accepts
>> a body of code or a function, then I find this operator very nice
>> and elegant.
> A mapcar-like function presumes a lambda where you can process every
> match as you need, but going this way you'd have a temptation to
> implement an analogous API from other programming languages like e.g.
> https://apidock.com/ruby/String/scan
I am not crazy about the mapcar-like implementation either.
Actually, I have changed my mind after Noah's nice suggestion.  He
mentioned the possibility of extending `cl-loop' with a new clause to
iterate over the matches for a regexp.
I think this clause fits well in cl-loop; this way we don't need to
introduce a new function/macro name.
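
For comparison, this is roughly the hand-written `while' loop that the
discussion above refers to.  It is only a sketch: the function name
`my-collect-matches-while' is made up for illustration and is not part
of the patch, and it assumes REGEXP never matches the empty string.

```elisp
;; Hypothetical helper, not part of the patch: collect every match for
;; REGEXP between point and the end of the current buffer.
;; Assumes REGEXP does not match the empty string; an empty match would
;; leave point in place and the loop would never terminate.
(defun my-collect-matches-while (regexp)
  "Return a list of the matches for REGEXP after point, in buffer order."
  (let (matches)
    (save-excursion
      (while (re-search-forward regexp nil t)
        ;; Group 0 is the full match.
        (push (match-string-no-properties 0) matches)))
    ;; `push' builds the list backwards, so restore buffer order.
    (nreverse matches)))
```

Calling (my-collect-matches-while "[0-9]+") with point at the start of a
buffer returns the runs of digits found after point, as strings.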

--8<-----------------------------cut here---------------start------------->8---
commit 59e66771d13fce73ff5220ce3df677b9247c9c52
Author: Tino Calancha <address@hidden>
Date:   Fri Apr 7 23:31:08 2017 +0900

    New clause in cl-loop to iterate in the matches of a regexp
    Add new clause in cl-loop facility to loop over the matches for
    REGEXP in the current buffer (Bug#26338).
    * lisp/emacs-lisp/cl-macs.el (cl--parse-loop-clause): Add new clause.
    (cl-loop): update docstring.
    * doc/misc/cl.texi (For Clauses): Document the new clause.
    * etc/NEWS: Mention this change.

diff --git a/doc/misc/cl.texi b/doc/misc/cl.texi
index 2339d57631..6c5c43ad09 100644
--- a/doc/misc/cl.texi
+++ b/doc/misc/cl.texi
@@ -2030,6 +2030,21 @@ For Clauses
 This clause iterates over a sequence, with @var{var} a @code{setf}-able
 reference onto the elements; see @code{in-ref} above.
+@item for @var{var} being the matches of @var{regexp}
+This clause iterates over the matches for @var{regexp} in the current buffer.
+By default, @var{var} is bound to the full match.  Optionally, @var{var}
+might be bound to a subpart of the match.  It's also possible to restrict
+the loop to a given number of matches.  For example,
+@example
+(cl-loop for x being the matches of "^(defun \\(\\S +\\)"
+         using '(group 1 limit 10)
+         collect x)
+@end example
+collects the next 10 function names after point.
 @item for @var{var} being the symbols [of @var{obarray}]
 This clause iterates over symbols, either over all interned symbols
 or over all symbols in @var{obarray}.  The loop is executed with
diff --git a/etc/NEWS b/etc/NEWS
index aaca229d5c..03f6ecb88b 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -862,6 +862,10 @@ instead of its first.
 * Lisp Changes in Emacs 26.1
+** New clause in cl-loop to iterate in the matches for a regexp
+in the current buffer.
 ** Emacs now supports records for user-defined types, via the new
 functions 'copy-record', 'make-record', 'record', and 'recordp'.
 Records are now used internally to represent cl-defstruct and defclass
diff --git a/lisp/emacs-lisp/cl-macs.el b/lisp/emacs-lisp/cl-macs.el
index 25c9f99992..50596c066e 100644
--- a/lisp/emacs-lisp/cl-macs.el
+++ b/lisp/emacs-lisp/cl-macs.el
@@ -892,6 +892,7 @@ cl-loop
       the overlays/intervals [of BUFFER] [from POS1] [to POS2]
       the frames/buffers
       the windows [of FRAME]
+      the matches of/for REGEXP [using (group GROUP [limit LIMIT])]
   Iteration clauses:
     repeat INTEGER
     while/until/always/never/thereis CONDITION
@@ -1339,6 +1340,33 @@ cl--parse-loop-clause
                  (push (list temp-idx `(1+ ,temp-idx))
+               ((memq word '(match matches))
+               (let* ((_ (or (and (not (memq (car cl--loop-args) '(of for)))
+                                   (error "Expected `of'"))))
+                     (regexp (cl--pop2 cl--loop-args))
+                      (group-limit
+                       (and (eq (car cl--loop-args) 'using)
+                            (consp (cadr cl--loop-args))
+                            (>= (length (cadr cl--loop-args)) 2)
+                            (cadr (cl--pop2 cl--loop-args))))
+                      (group
+                       (or (and group-limit
+                                (cl-find 'group group-limit)
+                                (nth (1+ (cl-position 'group group-limit)) group-limit))
+                           0))
+                      (limit
+                       (and group-limit
+                            (cl-find 'limit group-limit)
+                            (nth (1+ (cl-position 'limit group-limit)) group-limit)))
+                      (count (make-symbol "--cl-count")))
+                  (push (list count 0) loop-for-bindings)
+                  (push (list var nil) loop-for-bindings)
+                  (push `(re-search-forward ,regexp nil t) cl--loop-body)
+                  (push `(or (null ,limit) (and (natnump ,limit) (< ,count ,limit))) cl--loop-body)
+                  (push (list count `(1+ ,count)) loop-for-sets)
+                  (push (list var `(match-string-no-properties ,group))
+                        loop-for-sets)))
               ((memq word hash-types)
                (or (memq (car cl--loop-args) '(in of))
                     (error "Expected `of'"))
--8<-----------------------------cut here---------------end--------------->8---
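
To try the idea without applying the patch, the behavior of the new
clause can be approximated with clauses that cl-loop already has.  This
is a sketch under assumptions, not the exact expansion the patch
produces: the name `my-collect-matches-loop' and its GROUP and LIMIT
arguments are made up for illustration, and unlike the patch it checks
the limit before searching rather than after.

```elisp
(require 'cl-lib)

;; Hypothetical helper, not part of the patch.  GROUP defaults to 0 (the
;; full match), as in the patch; LIMIT, when non-nil, caps the number of
;; matches collected.
(defun my-collect-matches-loop (regexp &optional group limit)
  "Return matches for REGEXP after point, using existing cl-loop clauses.
GROUP selects a subexpression and LIMIT bounds the number of matches."
  (save-excursion
    (cl-loop with count = 0
             ;; Stop once LIMIT matches were collected or the search fails.
             while (and (or (null limit) (< count limit))
                        (re-search-forward regexp nil t))
             do (cl-incf count)
             collect (match-string-no-properties (or group 0)))))
```

For instance, (my-collect-matches-loop "^(defun \\(\\S +\\)" 1 10)
plays the role of the cl.texi example in the patch: it collects up to
10 function names after point.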
In GNU Emacs 26.0.50 (build 7, x86_64-pc-linux-gnu, GTK+ Version 3.22.11)
 of 2017-04-07
Repository revision: 67aeaa74af8504f950f653136d749c6dd03a60de

