re-search-backward does not properly report starting point or matching s

Subject: re-search-backward does not properly report starting point or matching string
Date: Thu, 09 Oct 2003 11:16:34 -0700
In GNU Emacs 21.2.1 (i386-debian-linux-gnu, X toolkit, Xaw3d scroll bars)
 of 2002-03-22 on raven, modified by Debian
configured using `configure i386-debian-linux-gnu --prefix=/usr --sharedstatedir=/var/lib --libexecdir=/usr/lib --localstatedir=/var/lib --infodir=/usr/share/info --mandir=/usr/share/man --with-pop=yes --with-x=yes --with-x-toolkit=athena --without-gif'
Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: nil
  locale-coding-system: nil
  default-enable-multibyte-characters: nil

Please describe exactly what actions triggered the bug
and the precise symptoms of the bug:

Eval the following, trivial, test function:

(defun z-test ()
  (interactive)  ; needed so the function can be run with M-x z-test
  (re-search-backward "[a-z]+")
  (message "point: %d, beginning: %d, end: %d, string: %s"
           (point) (match-beginning 0) (match-end 0) (match-string 0)))

Then, place cursor at the end of the string "sdfds" and run
z-test. A message like this will show up:

point: 1446, beginning: 1446, end: 1447, string: s

It reports only on the last character matched by the pattern.
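
To make this reproducible without depending on the contents of a particular buffer, here is a minimal sketch that runs the same search in a temporary buffer. The function name z-test-temp-buffer is made up for illustration; it uses nothing beyond the standard search and match-data functions.

```elisp
(defun z-test-temp-buffer ()
  "Search backward for \"[a-z]+\" from the end of \"sdfds\" in a temp buffer.
Return a list (POINT MATCH-BEGINNING MATCH-END MATCH-STRING)."
  (with-temp-buffer
    (insert "sdfds")
    (goto-char (point-max))          ; cursor at the end of the string
    (re-search-backward "[a-z]+")
    (list (point) (match-beginning 0) (match-end 0) (match-string 0))))

(z-test-temp-buffer)  ; gives (5 5 6 "s"), where I expected (1 1 6 "sdfds")
```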

It seems to me that re-search-backward works conceptually differently
from search-backward, search-forward and re-search-forward. It does
not "advance" (backwards) the cursor to the real beginning of the
pattern, as the other functions, including search-backward, do (see
more on this at *1*). But whatever justification there might be for
that (if there is any), the "match-string" is clearly incorrect.

Interestingly, the interactive version of the function, which
highlights the matched string as you type it, can properly identify
the boundaries of the correct "match-string". But even that is not
always bug-free. Applying the trivial expression "a+b" to the string
"aaab" only highlights the last "a" and the "b". As I understand it,
regexps should match the longest possible run of matching
characters. Using just "a+" as the expression will properly highlight
all the "a"s, of course.
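
The "a+b" case can also be checked non-interactively. The sketch below (again with a made-up function name, z-test-a+b) runs the search from the end of "aaab" in a temporary buffer and returns the matched text; it appears to show the same short match as the interactive highlighting.

```elisp
(defun z-test-a+b ()
  "Search backward for \"a+b\" from the end of \"aaab\"; return the match."
  (with-temp-buffer
    (insert "aaab")
    (goto-char (point-max))
    (re-search-backward "a+b")
    (match-string 0)))

(z-test-a+b)  ; gives "ab", where I expected "aaab"
```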

(*1*) Not placing the cursor at the beginning of the pattern is bad
for a number of reasons: (a) it makes it inconsistent with the
behaviour of the other three functions; (b) it prevents its use for
standard scripting like "(set-mark (point)) (re-search-backward
"a+") (kill-region (mark) (point))"; (c) it is outright silly that
interactively doing a backwards search for "a+" would need 10 presses
of "CTRL-R" to get past the string "aaaaaaaaaa".

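For the scripting case in (b), one possible workaround is sketched below. It assumes the pattern is a single repeated character, so that skip-chars-backward can extend point to the real beginning of the run; it is not a general fix for arbitrary regexps. The function name z-kill-run-backward and the sample text are made up for illustration.

```elisp
(defun z-kill-run-backward ()
  "Kill the run of \"a\"s before the end of a sample buffer.
Return the remaining buffer contents."
  (with-temp-buffer
    (insert "xx aaaaaaaaaa")
    (goto-char (point-max))
    (let ((end (point)))
      (re-search-backward "a+")   ; stops at the last \"a\" only
      (skip-chars-backward "a")   ; move back over the rest of the run
      (kill-region (point) end))  ; note: this also pushes onto the kill ring
    (buffer-string)))

(z-kill-run-backward)  ; gives "xx "
```

Without the skip-chars-backward call, only the last "a" would be killed.
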
Thanks for the nice work with my beloved emacs,


