Re: I'm is really I'm

From: Karl Fogel
Subject: Re: I'm is really I'm
Date: Tue, 06 Jul 2010 22:34:29 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.50 (gnu/linux)

Lennart Borgman <address@hidden> writes:
>Obviously this character is normally ' (char 39).
>Do we have any tool for replacing such characters in Emacs? Or is
>there a better way?

I get this problem all the time when pasting from web pages, PDFs, and
other sources of formatted text.

So I've been trying to write either a "filtered paste" or just a
function to clean up a region after pasting it.  But I'm rusty on
character representations in Emacs these days, and am having trouble
coming up with a way to represent (in Elisp source code) the characters
that most often need replacing.
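
For reference, one way to write such characters in Elisp source is with
Unicode escapes. This is a minimal sketch, assuming the pasted text has been
decoded so that the curly apostrophe is the single character U+2019; the
`sketch-` names are hypothetical and exist only for illustration.

```elisp
;; Three equivalent spellings of U+2019 RIGHT SINGLE QUOTATION MARK.
;; The `sketch-' names are hypothetical, not part of Emacs.
(defconst sketch-apostrophe-char ?\u2019
  "U+2019 as a character literal, using the \\uNNNN escape.")

(defconst sketch-apostrophe-string "\u2019"
  "U+2019 as a one-character string literal.")

(defconst sketch-apostrophe-built (string #x2019)
  "U+2019 as a string built from its code point.")
```

All three forms avoid typing the non-ASCII character into the source file, so
the file's own encoding stops mattering.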

Anyone who wants to play Captain Obvious on the code below, go for it.
It would be nice to give Emacs a standard solution to this common
problem.

  (defun clean-region (start end)
    "Clean up a region of text that comes from a non-plaintext source.
  Formatted sources, such as web pages and PDF documents, often contain
  characters that could be reasonably represented in plain ASCII but are
  not.  For example the characters referenced by &rdquo; and &ldquo; in
  HTML are not the same as ASCII 34 (double quote).  It is sometimes
  desirable to simply convert the formatted text to ASCII."
    (interactive "*r")
    ;; TODO: this is not working yet.  Maybe make chars, not strings,
    ;; and this might work?  Not sure.
    (let ((open-double-quote  (make-string 3 0))
          (close-double-quote (make-string 3 0))
          (funderscore        ? )
          (apostrophe         (make-string 3 0)))
      ;; I don't know any other way to make these strings besides
      ;; just setting each character by hand... but even that doesn't
      ;; seem to result in a working `replace-string' in the end.
      (aset open-double-quote 0 ?â)
      (aset open-double-quote 1 128)
      (aset open-double-quote 2 156)
      (aset close-double-quote 0 ?â)
      (aset close-double-quote 1 128)
      (aset close-double-quote 2 157)
      (aset apostrophe 0 ?â)
      (aset apostrophe 1 128)
      (aset apostrophe 2 153)
      (save-excursion
        (goto-char start)
        (replace-string apostrophe "'"  nil start end))))
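
The values 226 (?â), 128 and 153 set above are the three UTF-8 bytes of
U+2019, and the sequences ending in 156 and 157 are the bytes of U+201C and
U+201D. If the buffer was decoded as UTF-8, it holds one character per quote
rather than three, which may be why the search never matches. Below is a
minimal sketch under that assumption, not a tested standard solution. The
names `clean-region-replacements' and `clean-region-sketch' are hypothetical.

```elisp
;; Hypothetical names; neither the variable nor the function is part of Emacs.
(defvar clean-region-replacements
  '(("\u201C" . "\"")   ; left double quotation mark
    ("\u201D" . "\"")   ; right double quotation mark
    ("\u2018" . "'")    ; left single quotation mark
    ("\u2019" . "'"))   ; right single quotation mark (curly apostrophe)
  "Alist mapping typographic characters to ASCII replacements.")

(defun clean-region-sketch (start end)
  "Replace typographic quotes between START and END with ASCII ones."
  (interactive "*r")
  (save-excursion
    (save-restriction
      ;; Narrowing keeps every search inside the region, even if a
      ;; replacement changes the region's length.
      (narrow-to-region start end)
      (dolist (pair clean-region-replacements)
        (goto-char (point-min))
        (while (search-forward (car pair) nil t)
          ;; FIXEDCASE and LITERAL are both t: insert the text exactly.
          (replace-match (cdr pair) t t))))))
```

Driving the replacements from an alist means more characters (dashes,
ellipses, non-breaking spaces) can be added without touching the function.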
