bug-gnustep

GSFromUnicode stack trashing bug


From: Wim Oudshoorn
Subject: GSFromUnicode stack trashing bug
Date: Wed, 18 Jan 2006 12:16:49 +0100
User-agent: Gnus/5.1002 (Gnus v5.10.2) Emacs/22.0.50 (darwin)

There is a bug in the function GSFromUnicode in the file Unicode.m.
This bug can corrupt the stack.  It is a little tricky to explain,
so bear with me :-)


First, look at this code fragment:

      while (dpos + sl >= bsize)
        {
          GROW ();
        }

      if (sl == 1)
        {
          ptr [dpos++] = u & 0x7f;
        }

This occurs around Unicode.m:1836.
Here

  dpos  = the index into the destination buffer
  ptr   = the base pointer of the destination buffer
  sl    = the length of the UTF-8 encoded character that needs to be
          written to the destination buffer
  bsize = the length of the destination buffer

The check in the while condition is off by one. Consider the following example:

dpos = 0,
sl = 1,
bsize = 1,

So we have a buffer of length 1, and the character will be written at index 0,
so there is enough space.  However, we will still grow the buffer.
(No, this is not needed to accommodate the terminating null character.)
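
To make the off-by-one concrete, here is a small sketch that compares the
check as quoted with a stricter one.  The function names are made up for
this mail and do not exist in Unicode.m, and the stricter check only fixes
the single-byte case described here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the condition as quoted from the while loop.
 * It asks for growth even when the sl bytes fit exactly. */
int needs_grow_original(size_t dpos, size_t sl, size_t bsize)
{
  assert(sl > 0);
  return dpos + sl >= bsize;
}

/* Hypothetical helper: grow only when writing sl bytes starting at
 * index dpos would run past the end of a buffer of bsize bytes. */
int needs_grow_exact(size_t dpos, size_t sl, size_t bsize)
{
  assert(sl > 0);
  return dpos + sl > bsize;
}
```

With dpos = 0, sl = 1, bsize = 1 the quoted condition asks for growth,
while the stricter one does not.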

Of course, on its own this is not very serious: you just grow the buffer too
soon.  That would be true if the GROW macro always did what you expect.  Let's
look at the GROW macro:

  if (dst == 0) \
    { \
      /* \
       * Data is just being discarded anyway, so we can \
       * adjust the offset into the local buffer on the \
       * stack and pretend the buffer has grown. \
       */ \
      if (extra == 0) \
        { \
          ptr -= BUFSIZ; \
          bsize += BUFSIZ; \
        } \
      else \
        { \
          ptr -= BUFSIZ-1; \
          bsize += BUFSIZ-1; \
        } \
      } \
   else if (zone == 0) \
     ....


Here you see that if dst == 0 the result is discarded.  Instead of
skipping the conversion, the function just reuses a fixed buffer on the
stack and cycles through it.
So in our case above, assume in addition that

extra = 0
BUFSIZ = 1

Now you will see that GROW has the disastrous effect of
FIRST subtracting 1 from ptr, and only then is the encoded character
written to that address.  So you write BEFORE the beginning of the buffer
allocated on the stack.
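
The arithmetic can be checked without touching real memory.  The sketch
below models ptr as a signed offset from the start of the local buffer and
replays the while loop together with the dst == 0, extra == 0 branch of
GROW.  The function name is made up, and it assumes that bsize starts out
equal to BUFSIZ (as in the example above, where both are 1).

```c
#include <assert.h>

/* Hypothetical model, not code from Unicode.m.
 * Returns the position of the first byte written, relative to the start
 * of the local buffer.  A negative result means the write lands before
 * the buffer. */
long first_write_offset(long dpos, long sl, long bufsiz)
{
  long offset = 0;       /* models ptr - (start of local buffer) */
  long bsize = bufsiz;   /* assumption: buffer starts with BUFSIZ bytes */

  assert(dpos >= 0 && sl > 0 && bufsiz > 0);
  while (dpos + sl >= bsize)
    {
      offset -= bufsiz;  /* models: ptr -= BUFSIZ */
      bsize += bufsiz;   /* models: bsize += BUFSIZ */
    }
  return offset + dpos;  /* models: &ptr[dpos] */
}
```

For dpos = 0, sl = 1, bufsiz = 1 this gives -1, one byte before the
buffer.  Under the same assumptions a larger buffer shows the same result
once dpos reaches the last slot, for example dpos = 7, sl = 1, bufsiz = 8.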

This already happens when the encoding length of the character is 1.  If the
encoding length is longer, the problem is not as easily fixed as just fixing
the check in the while loop.

I guess that the best way of fixing this is to let go of the GROW dst == 0
hack and just write macros that append a byte to the destination buffer.
But if someone has other opinions, please let me know.
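
As a rough sketch of what I mean by appending a byte, something along
these lines could work.  It is written as a function rather than a macro,
and the type and function names are invented for illustration; nothing
like this exists in Unicode.m today.  When there is no destination buffer
the byte is only counted, so no pointer ever has to be moved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical output sink. */
typedef struct {
  unsigned char *buf;  /* NULL means: discard the output, only count it */
  size_t cap;          /* capacity of buf in bytes (ignored if buf is NULL) */
  size_t len;          /* number of bytes produced so far */
} byte_sink;

/* Append one byte.  Returns 1 on success, or 0 if a real buffer is full,
 * in which case the caller has to grow it and try again. */
int sink_append(byte_sink *s, unsigned char b)
{
  assert(s != NULL);
  if (s->buf != NULL)
    {
      if (s->len >= s->cap)
        {
          return 0;
        }
      s->buf[s->len] = b;
    }
  s->len++;
  return 1;
}
```

A multi-byte character would then be written with one call per byte, so
the bounds check is the same for every encoding length.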

Wim Oudshoorn.



