Re: [emacs-humanities] How can I `upcase-word` for ALL words in the buff


From: Marcin Borkowski
Subject: Re: [emacs-humanities] How can I `upcase-word` for ALL words in the buffer without typing the word twice?
Date: Wed, 19 Apr 2023 22:13:10 +0200
User-agent: mu4e 1.1.0; emacs 30.0.50

On 2023-04-10, at 07:55, David Hedlund <public@beloved.name> wrote:

> I'd be very happy if you would like to help me to write a function for this.

The code (with explanation) can be found at
https://mbork.pl/2023-04-15_Downcasing_word_at_point_in_the_whole_buffer
For the record, I yank it (in Org mode syntax) below.

--8<---------------cut here---------------start------------->8---
Some time ago, one of the members of the
[[https://lists.gnu.org/mailman/listinfo/emacs-humanities][Emacs-humanities]]
mailing list
[[https://lists.gnu.org/archive/html/emacs-humanities/2023-04/msg00000.html][mentioned a very specific problem]].
He wants to be able to replace all
occurrences of the word at point with its uppercase (or lowercase)
variant.  This is one of those things that can be solved with a bit of
custom Elisp.  Being a big fan of writing small (or sometimes not so
small) helper functions to make editing easier, I offered to do a bit
of coding to accomplish this task, and here it is.  Being a teacher,
I’d like to provide at least a short explanation, too.  (Most of you
probably know that I spent quite some time in 2021 doing exactly this
– coding Elisp and explaining it – and the result is my book
[[https://leanpub.com/hacking-your-way-emacs][Hacking your way around in Emacs]],
designed as a “next step” after Robert J. Chassell's excellent
[[https://www.gnu.org/software/emacs/manual/html_node/eintr/index.html][An introduction to programming in Emacs Lisp]].
Check out Chassell's book if you are interested in learning Elisp, and
then my book if you want to go further!)

First of all, we need to be able to know the word at point.  This one
is easy – you can say, well, ~(word-at-point)~.  This function comes
with Emacs, although it is /not/ defined right away – you need to
~require~ the ~thingatpt~ /feature/.
#+begin_src elisp
  (require 'thingatpt)
#+end_src
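
To see what ~word-at-point~ returns, here is a small sketch that can
be evaluated with ~M-:~ or in the =*scratch*= buffer.  The sample text
and the position of point are made up for illustration.
#+begin_src elisp
  (require 'thingatpt)

  ;; Put point inside "Hello" in a temporary buffer and ask for the
  ;; word there.
  (with-temp-buffer
    (insert "Hello World")
    (goto-char 3)                         ; between "e" and the first "l"
    (word-at-point))                      ; => "Hello"
#+end_src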

The next step is to downcase the word given by ~word-at-point~.  If
you invoke the ~apropos-function~ command and find all the functions
containing the string ~down~, you will quickly find the ~downcase~
function, accepting a string and returning its lowercase version.
(There are other downcasing functions in Emacs, but most of them
operate on the current buffer instead of on a string.)  Note that it
is often a good idea to look for functions in a fresh Emacs with no
packages loaded and no init file evaluated (you can start such an
instance with ~emacs -Q~) – in my Emacs, there are 138 functions
containing the string ~down~, but only 24 in an ~emacs -Q~ session.
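
As a quick check of that difference, the sketch below calls ~downcase~
on a string and ~downcase-word~ (one of the buffer-oriented commands)
in a temporary buffer.  The sample strings are arbitrary.
#+begin_src elisp
  ;; `downcase' takes a string and returns a new, lowercase string.
  (downcase "Emacs LISP")                 ; => "emacs lisp"

  ;; `downcase-word' edits the buffer text after point instead.
  (with-temp-buffer
    (insert "Emacs LISP")
    (goto-char (point-min))
    (downcase-word 1)                     ; downcases "Emacs" only
    (buffer-string))                      ; => "emacs LISP"
#+end_src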

Now what we want is a loop, replacing the word at point with its
lowercase version until the end of buffer.  Since we need to do this
also before point, we’ll start with ~(goto-char (point-min))~; since
we don’t want the user to notice that we were moving the point around,
we’ll wrap it in ~save-excursion~.  (We /could/ do without a loop,
using the ~replace-string~ function.  However, its docstring advises
us to use it only interactively and use ~search-forward~ and
~replace-match~ in Elisp code.  The reason is that it has side effects
like printing to the minibuffer and setting the mark.)

Now, my first attempt looked like this.
#+begin_src elisp
  (save-excursion
    (goto-char (point-min))
    (while (not (eobp))
      (search-forward word nil 'move)
      (replace-match downcased-word t t)))
#+end_src
Notice the ~'move~ argument in the ~search-forward~ invocation.  If the
third argument is omitted or nil, an error is signaled when the search
string is not found (not what we want).  If it is ~t~, no error is
signaled, but the point does not move – not useful for us, either,
since we need the point to move to the end of buffer for ~eobp~ to end
the loop.  It turns out that we can use any other value (here I used
the symbol ~'move~, but I could use, say, the string ~"move"~ or the
number 0) to tell ~search-forward~ to move the point to the bound of
the search.
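
The three behaviors can be compared side by side.  This is only
a sketch on a made-up buffer; ~"qux"~ is a string chosen so that the
search fails every time.
#+begin_src elisp
  (with-temp-buffer
    (insert "foo bar")
    (list
     ;; Third argument nil: a failed search signals `search-failed'.
     (progn (goto-char (point-min))
            (condition-case nil
                (search-forward "qux" nil nil)
              (search-failed 'signaled)))
     ;; Third argument t: returns nil and leaves point where it was.
     (progn (goto-char (point-min))
            (list (search-forward "qux" nil t) (point)))
     ;; Any other non-nil value: returns nil, point moves to the bound
     ;; (here the end of the buffer).
     (progn (goto-char (point-min))
            (list (search-forward "qux" nil 'move) (point)))))
  ;; => (signaled (nil 1) (nil 8))
#+end_src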

Also, providing ~t~ as the second and third argument of
~replace-match~ is important.  The first ~t~ prevents ~replace-match~
from adapting the case of the replacement to the case of the matched
text (which would defeat the whole point of our code).  The second ~t~
makes it insert the replacement literally, without interpreting the
special sequences used in /regex/ replacements (we do not want
that, either).
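
To see what each ~t~ changes, here is a small experiment.  The helper
~my/replace-first~ is hypothetical (it is not part of Emacs or of the
final code); it only replaces the first match in a temporary buffer
and returns the resulting text.
#+begin_src elisp
  (defun my/replace-first (text from to fixedcase literal)
    "Replace the first FROM in TEXT with TO and return the new text.
  FIXEDCASE and LITERAL are passed on to `replace-match'."
    (with-temp-buffer
      (insert text)
      (goto-char (point-min))
      (let ((case-fold-search nil))
        (when (search-forward from nil t)
          (replace-match to fixedcase literal)))
      (buffer-string)))

  ;; Second argument nil: the replacement is upcased again to match
  ;; the all-caps text it replaces.
  (my/replace-first "FOO" "FOO" "foo" nil t)    ; => "FOO"
  (my/replace-first "FOO" "FOO" "foo" t t)      ; => "foo"

  ;; Third argument nil: \& in the replacement means "the matched text".
  (my/replace-first "FOO" "FOO" "<\\&>" t nil)  ; => "<FOO>"
  (my/replace-first "FOO" "FOO" "<\\&>" t t)    ; => "<\\&>"
#+end_src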

This, however, didn’t work – it just hung.  At first I didn’t know
why, but I ran it through
[[https://www.gnu.org/software/emacs/manual/html_node/elisp/Edebug.html][Edebug]]
and the reason soon became apparent.
The ~replace-match~ function moves the point to the end of the
replacement string.  As long as ~search-forward~ /finds/ the next
instance of ~word~, that is fine – then, ~replace-match~ just leaves
the point where ~search-forward~ put it.  But when ~search-forward~
does /not/ find ~word~, it puts the point at the end of the buffer
(because we asked it to do so!), and then ~replace-match~, which still
acts on the last successful match, puts it /back/ where it was
previously – so the condition in ~while~ is always true and the loop
never exits.  The solution is obvious (and in fact, it was even shown
in the docstring of ~replace-string~!):
#+begin_src elisp
  (save-excursion
    (goto-char (point-min))
    (while (search-forward word nil t)
      (replace-match downcased-word t t)))
#+end_src
This uses the fact that ~search-forward~ returns a non-nil value when
it finds the given string, and nil otherwise (provided its third
parameter is non-nil – if it is nil or omitted, it signals an error
then, which is not the behavior we want).

Of course, we need the variables ~word~ and ~downcased-word~ to
contain the right things.  The Elisp way to set temporary variables
like that is the ~let~ clause.  In this case, we’ll need ~let*~, since
the definition of ~downcased-word~ needs ~word~, defined earlier in
the same clause.  (We could nest ~let~ clauses, but ~let*~ is more
concise.)  An additional advantage of using clauses like ~let*~ or
~let~ is that we can temporarily set ~case-fold-search~ to nil, and
its previous value will reappear when ~let~ is done without any other
action on our side.  (This is what we need to make the search
case-sensitive.)
#+begin_src elisp
  (let* ((word (word-at-point))
         (downcased-word (downcase word))
         (case-fold-search nil))
    (save-excursion
      (goto-char (point-min))
      (while (search-forward word nil t)
        (replace-match downcased-word t t))))
#+end_src
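
The snippet above can be tried safely on a temporary buffer.  The
sample text below is invented; note what happens to ~FOObar~ and to
~Foo~.
#+begin_src elisp
  (require 'thingatpt)

  (with-temp-buffer
    (insert "FOO FOObar Foo")
    (goto-char (point-min))               ; point is on the first "FOO"
    (let* ((word (word-at-point))
           (downcased-word (downcase word))
           (case-fold-search nil))        ; case-sensitive search
      (save-excursion
        (goto-char (point-min))
        (while (search-forward word nil t)
          (replace-match downcased-word t t))))
    (buffer-string))                      ; => "foo foobar Foo"
#+end_src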

This code has one drawback – it will also replace instances of ~word~
being /parts/ of longer words.  This is probably not what we need.  To
overcome this limitation, let’s search for a regular expression
instead, and add a ~\b~ at both ends.  One problem with this approach
is that ~word~ might contain some characters interpreted in a special
way in a regex.  Fortunately, Elisp has the ~regexp-quote~ function
which escapes them, so e.g. ~(regexp-quote ".*")~ evaluates to the
string ="\\.\\*"=.  Also, let’s wrap our code into a ~defun~ so that
we can actually use it interactively.
#+begin_src elisp
  (require 'thingatpt)
  (defun downcase-instances-of-word-at-point ()
    "Convert every instance of the word at point to lowercase."
    (interactive)
    (let* ((word (word-at-point))
           (downcased-word (downcase word))
           (case-fold-search nil))
      (save-excursion
        (goto-char (point-min))
        (while (re-search-forward
                (format "\\b%s\\b" (regexp-quote word)) nil t)
          (replace-match downcased-word t t)))))
#+end_src
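
The effect of the two ~\b~ anchors and of ~regexp-quote~ can be
checked on plain strings, without touching any buffer.  This is just
an illustrative sketch; ~"FOO"~ and ~"C++"~ are arbitrary sample
words.
#+begin_src elisp
  (let ((case-fold-search nil)
        (re (format "\\b%s\\b" (regexp-quote "FOO"))))
    (list (string-match-p re "a FOO b")   ; whole word, found at index 2
          (string-match-p re "FOObar")    ; no word boundary after "FOO"
          (string-match-p re "foo")))     ; different case
  ;; => (2 nil nil)

  ;; `regexp-quote' makes special characters match themselves:
  (regexp-quote "C++")                    ; => "C\\+\\+"
#+end_src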

And this is it.  Of course, this is Emacs, so we can tinker more.  For
example, we could introduce another variable, say ~count~, initialize
it to 0 and then increment it every time we do a replacement, and
display the replacement count at the end.  We could allow the user to
/edit/ the word to be replaced, suggesting the word at point as the
default.  These possibilities are left as an exercise for the
reader ~;-)~.  Happy hacking!
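
For readers who would rather peek at one possible answer, here is
a sketch of the counting variant.  The function name is made up, and
the ~user-error~ check for a missing word at point is an addition of
this sketch, not something the code above does.
#+begin_src elisp
  (require 'thingatpt)

  (defun my/downcase-instances-of-word-at-point-counting ()
    "Downcase every instance of the word at point; return the count."
    (interactive)
    (let* ((word (or (word-at-point)
                     (user-error "No word at point")))
           (regexp (format "\\b%s\\b" (regexp-quote word)))
           (downcased-word (downcase word))
           (case-fold-search nil)
           (count 0))
      (save-excursion
        (goto-char (point-min))
        (while (re-search-forward regexp nil t)
          (replace-match downcased-word t t)
          (setq count (1+ count))))       ; one more replacement done
      (message "Downcased %d instance(s) of %s" count word)
      count))
#+end_src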
--8<---------------cut here---------------end--------------->8---

Best,

-- 
Marcin Borkowski
http://mbork.pl


