[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#24206: 25.1; Curly quotes generate invalid strings, leading to a seg

From: Eli Zaretskii
Subject: bug#24206: 25.1; Curly quotes generate invalid strings, leading to a segfault
Date: Wed, 17 Aug 2016 18:12:48 +0300

> Cc: address@hidden, address@hidden, address@hidden,
>  address@hidden
> From: Paul Eggert <address@hidden>
> Date: Tue, 16 Aug 2016 14:07:05 -0700
> The changes were motivated by bug fixes, not style.

That's not what I see.  E.g., this hunk simply replaces valid code by
an equivalently valid code:

  -       if (multibyte)
  -         {
  -           int len;
  -           STRING_CHAR_AND_LENGTH (strp, len);
  -           if (len == 1)
  -             *bufp = *strp;
  -           else
  -             memcpy (bufp, strp, len);
  -           strp += len;
  -           bufp += len;
  -           nchars++;
  -         }
  -       else
  -         *bufp++ = *strp++, nchars++;
  +       /* Fall through to copy one char.  */

Same here:

  -      else if (strp[0] == '\\' && strp[1] == '[')
  +      else if (strp[0] == '\\' && strp[1] == '['
  +            && (close_bracket
  +                = memchr (strp + 2, ']',
  +                          SDATA (str) + strbytes - (strp + 2))))
  -       ptrdiff_t start_idx;
            bool follow_remap = 1;

  -       strp += 2;            /* skip \[ */
  -       start = strp;
  -       start_idx = start - SDATA (string);
  -       while ((strp - SDATA (string)
  -               < SBYTES (string))
  -              && *strp != ']')
  -         strp++;
  -       length_byte = strp - start;
  -       strp++;               /* skip ] */

and here (which, for some reason, loses part of a comment, and IMO
makes it half a riddle for the uninitiated):

  -       /* Note the Fwhere_is_internal can GC, so we have to take
  -          relocation of string contents into account.  */
  -       strp = SDATA (string) + idx;
  -       start = SDATA (string) + start_idx;
  +       /* Take relocation of string contents into account.  */
  +       strp = SDATA (str) + idx;
  +       start = strp - length_byte - 1;

etc. etc. -- I see a lot of changes that have nothing to do with the
real bugs in this function, they just rearrange valid code, change the
way intermediate variables are used, etc.

> Although the bugs were mostly minor (e.g., generating bogus NUL bytes due to 
> miscounting) it's fine to fix minor bugs. I did change nearby style 
> (indenting as per GNU style, switching some locals to C99-style 
> decl-after-statement, etc.) but none of the changes were pervasive or were 
> intended for the emacs-25 branch, and it's fine to make such changes in 
> master.

What code generated bogus null bytes?

I'm not saying it isn't fine to make such changes, I'm urging you and
the others to resist the temptation of doing so unless really
necessary.  We are operating in the area of diminishing returns, and
too many times introduce regressions into code that was working
properly for decades.  We should try to minimize that.  Emacs is not
supposed to become less stable in core code, unless its gets
significant improvements or new features.

> One of the bugs was O(N**2) performance when reallocating a temporary buffer. 
> While I was at it, I changed the code to allocate a small temp buffer on the 
> stack to avoid a malloc/free in the usual case, which should be a small win. 
> This accounts for many changes that a quick glance might give the mistaken 
> impression of being stylistic.

Where's the O(N**2) performance, and why does performance matter in
this function anyway?  I don't think we ever had complaints about this
being slow.  The new code is more complex, because it sometimes uses
the stack and sometimes the heap, so more opportunities for bugs in
it.  I don't see any net gains, sorry.

>     without any comments as to why we handle the input string as
>     multibyte for the rest of the function, I think this will confuse
>     someone down the road.
> OK, I added some comments along those lines (see attached patch).


>     I think the refactoring already introduced a
>     regression.
> This comment appears to be about changes made in May for Bug#23425, not about 
> code changes I made recently. The May changes were not a regression; they 
> were intended and are documented in etc/NEWS. Alan Mackenzie felt strongly 
> that some changes were needed in this area. See commit 
> 433d366dc7b053048abf710d790ff62421dd1570.

Right, sorry I forgot about that.  Unlike at that time, I now think
this was a bad move, because Emacs 25.1 will have the disabled
conversion in it, so by the time we release the code in master, it
would be an incompatible change.  Also, ‘..’ is left unchanged, but
'..' is not, which is inconsistent.  So I think that change should be
reverted on master.  (I also don't see how it is related to the
original bug report, which AFAIU was about (message "`foo'") that
still behaves as in the bug report.)

>     maybe we should just go to the original code, after fixing the
>     immediate bugs, as I proposed in a patch yesterday?
> No, the code in master should be uniformly better than what's in emacs-25.

I can't say I see that, sorry.

> It would be nice to have good tests for substitute-command-keys, of course. 
> (We can all add this to our lists of things to do. :-) 

Which seems to be just another way of saying NO, sigh.

(Mumbles something about Emacs maintenance being a lonely business...)

reply via email to

[Prev in Thread] Current Thread [Next in Thread]