bug-gnu-emacs

bug#13032: 24.3.50; Request: Provide a `delete-duplicate-lines' command


From: Juri Linkov
Subject: bug#13032: 24.3.50; Request: Provide a `delete-duplicate-lines' command
Date: Sun, 02 Dec 2012 02:45:44 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (x86_64-pc-linux-gnu)

> * I'm thinking that the ADJACENT argument is kinda unnecessary.  I
> can't think of a use-case where someone wants to remove only the
> _adjacent_ duplicate lines but not the ones which aren't adjacent.
> So, I think that both the interface and the implementation could be
> simplified by removing that argument.

The ADJACENT argument is an optimization: it requires no additional
memory, because previous lines don't have to be stored in a cache.
This is useful when the user needs to delete duplicate lines in a
large sorted file, where all duplicates are adjacent anyway.
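
To illustrate the trade-off, here is a rough sketch that works on plain
lists of strings instead of buffer text.  The names `my-dedup-adjacent'
and `my-dedup-all' are made up for this example and are not part of the
patch:

```elisp
;; Hypothetical helpers for illustration only; not part of the patch.

(defun my-dedup-adjacent (lines)
  "Return LINES (a list of strings) without adjacent repeats.
Only the previous line is remembered, so the extra memory is constant."
  (let (prev result)
    (dolist (line lines)
      ;; PREV starts as nil, which is never `equal' to a string.
      (unless (equal line prev)
        (push line result))
      (setq prev line))
    (nreverse result)))

(defun my-dedup-all (lines)
  "Return LINES (a list of strings) without repeats, keeping first occurrences.
Every distinct line is stored in a hash table, so memory use grows
with the number of distinct lines."
  (let ((seen (make-hash-table :test 'equal))
        result)
    (dolist (line lines)
      (unless (gethash line seen)
        (puthash line t seen)
        (push line result)))
    (nreverse result)))
```

On sorted input such as '("a" "a" "b" "c" "c") both functions return
("a" "b" "c").  On unsorted input such as '("a" "b" "a") only
`my-dedup-all' removes the second "a".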

> * Why is the INTERACTIVE argument needed?  I mean, can't that info
> (whether the function has been called interactively) be retrieved
> using some Lisp primitive?

There is `called-interactively-p', but as I understand it, it is unreliable.
This is why other similar commands like `flush-lines', `keep-lines', and
`how-many' use the INTERACTIVE argument.  They use it for two purposes:
to decide whether the active region should be used, and to decide whether
a message should be displayed.
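
A minimal sketch of the idiom, showing only the message part (not the
region handling).  `my-count-lines-demo' is a hypothetical command
written for this mail, not something that exists in Emacs:

```elisp
;; Hypothetical command for illustration only.
(defun my-count-lines-demo (rstart rend &optional interactive)
  "Return the number of lines between RSTART and REND.
When INTERACTIVE is non-nil, also display the count in the echo area."
  (interactive
   ;; The interactive spec passes t as the last argument, so the body
   ;; can tell an interactive call from a call made by Lisp code.
   (list (region-beginning) (region-end) t))
  (let ((count (count-lines rstart rend)))
    (when interactive
      (message "%d line%s" count (if (= count 1) "" "s")))
    count))
```

Lisp callers simply omit the last argument and get the count back
without any message; M-x gets both.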

> * In case the INTERACTIVE argument is indeed necessary, it should be
> explained in the docstring, no?

Yes, below I copied this part from the docstring of `how-many'.

> * I think that the docstring should explain also the return value
> (number of duplicate lines deleted).

Coincidentally, the return value will be explained in the same part
of the docstring.

The remaining problem is deciding where to put this command.
The file replace.el is unsuitable because unlike `flush-lines' and
unlike `how-many', `delete-duplicate-lines' doesn't use regexps.

It seems the right place is sort.el because it also contains a related
command `reverse-region'.  This patch puts `delete-duplicate-lines'
after `reverse-region' at the end of sort.el:

=== modified file 'lisp/sort.el'
--- lisp/sort.el        2012-08-03 08:15:24 +0000
+++ lisp/sort.el        2012-12-02 00:44:42 +0000
@@ -562,6 +562,59 @@ (defun reverse-region (beg end)
        (setq ll (cdr ll)))
       (insert (car ll)))))
 
+;;;###autoload
+(defun delete-duplicate-lines (rstart rend &optional reverse adjacent interactive)
+  "Delete duplicate lines in the region between RSTART and REND.
+
+If REVERSE is nil, search and delete duplicates forward keeping the first
+occurrence of duplicate lines.  If REVERSE is non-nil (when called
+interactively with C-u prefix), search and delete duplicates backward
+keeping the last occurrence of duplicate lines.
+
+If ADJACENT is non-nil (when called interactively with two C-u prefixes),
+delete repeated lines only if they are adjacent.
+
+When called from Lisp and INTERACTIVE is omitted or nil, return the number
+of deleted duplicate lines, do not print it; if INTERACTIVE is t, the
+function behaves in all respects as if it had been called interactively."
+  (interactive
+   (progn
+     (barf-if-buffer-read-only)
+     (list (region-beginning) (region-end)
+          (equal current-prefix-arg '(4))
+          (equal current-prefix-arg '(16))
+          t)))
+  (let ((lines (unless adjacent (make-hash-table :test 'equal)))
+       line prev-line
+       (count 0)
+       (rstart (copy-marker rstart))
+       (rend (copy-marker rend)))
+    (save-excursion
+      (goto-char (if reverse rend rstart))
+      (if (and reverse (bolp)) (forward-char -1))
+      (while (if reverse
+                (and (> (point) rstart) (not (bobp)))
+              (and (< (point) rend) (not (eobp))))
+       (setq line (buffer-substring-no-properties
+                   (line-beginning-position) (line-end-position)))
+       (if (if adjacent (equal line prev-line) (gethash line lines))
+           (progn
+             (delete-region (progn (forward-line 0) (point))
+                            (progn (forward-line 1) (point)))
+             (if reverse (forward-line -1))
+             (setq count (1+ count)))
+         (if adjacent (setq prev-line line) (puthash line t lines))
+         (forward-line (if reverse -1 1)))))
+    (set-marker rstart nil)
+    (set-marker rend nil)
+    (when interactive
+      (message "Deleted %d %sduplicate line%s%s"
+              count
+              (if adjacent "adjacent " "")
+              (if (= count 1) "" "s")
+              (if reverse " backward" "")))
+    count))
+
 (provide 'sort)
 
 ;;; sort.el ends here
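
With the patch applied, a call from Lisp could look like the sketch
below.  It assumes `delete-duplicate-lines' is available with the
argument order shown in the patch; the wrapper `my-dedup-string' is a
hypothetical helper written only for this example:

```elisp
;; Hypothetical helper; assumes `delete-duplicate-lines' from the patch
;; above is defined.
(defun my-dedup-string (text &optional reverse adjacent)
  "Delete duplicate lines from TEXT and return (COUNT . RESULT).
COUNT is the value returned by `delete-duplicate-lines', and RESULT
is the remaining text."
  (with-temp-buffer
    (insert text)
    ;; INTERACTIVE is omitted, so no message is displayed.
    (let ((count (delete-duplicate-lines (point-min) (point-max)
                                         reverse adjacent)))
      (cons count (buffer-string)))))
```

For example, (my-dedup-string "a\nb\na\nc\nb\n") should delete the
second "a" and the second "b", giving (2 . "a\nb\nc\n").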





