
Re: Bug#493218: gettext: crash with some unicode chars (fwd)

From: Bruno Haible
Subject: Re: Bug#493218: gettext: crash with some unicode chars (fwd)
Date: Sun, 3 Aug 2008 22:04:24 +0200
User-agent: KMail/1.5.4


Yann <address@hidden> wrote:
> To reproduce, create a file test.py containing only u'\udfff' and
> run xgettext test.py.
> We get an "Aborted" message.

Find attached the fix that I just committed. Thanks for the report.

> This string isn't translatable, so why does xgettext parse it? And why does
> it fail?

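xgettext's logic would be more complex if it parsed strings only when deemed
"necessary". It's simpler to parse all identifiers and strings into a stream
of tokens first. It failed because \udfff is a half surrogate, which is
invalid on its own; the fix replaces it with U+FFFD.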


2008-08-03  Bruno Haible  <address@hidden>

        * x-python.c (mixed_string_buffer_append): Replace a lone low
        surrogate (U+DC00..U+DFFF) with U+FFFD.
        Reported by Yann <address@hidden>
        via Santiago Vila <address@hidden>.

*** x-python.c  20 Apr 2008 05:23:52 -0000      1.32
--- x-python.c  3 Aug 2008 19:56:58 -0000
*** 930,935 ****
--- 930,940 ----
          if (c >= UNICODE (0xd800) && c < UNICODE (0xdc00))
            bp->utf16_surr = UNICODE_VALUE (c);
+         else if (c >= UNICODE (0xdc00) && c < UNICODE (0xe000))
+           {
+             /* A half surrogate is invalid, therefore use U+FFFD instead.  */
+             mixed_string_buffer_append_unicode (bp, 0xfffd);
+           }
          else
            mixed_string_buffer_append_unicode (bp, UNICODE_VALUE (c));
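
For readers unfamiliar with surrogate handling, the idea behind the patch can
be sketched as a small standalone decoder. This is not the code from
x-python.c: the names utf16_state, feed_unit and decode_utf16 are hypothetical,
and the handling of a pending first half that is not followed by a second half
is an assumption of this sketch, not something shown in the diff above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT_CHAR 0xFFFDu

/* Hypothetical decoder state: a pending first-half surrogate, or 0.  */
struct utf16_state
{
  uint16_t pending;
};

/* Feed one UTF-16 code unit.  Writes 0, 1 or 2 code points to OUT and
   returns how many were written.  */
size_t
feed_unit (struct utf16_state *st, uint16_t unit, uint32_t *out)
{
  size_t n = 0;

  assert (st != NULL && out != NULL);

  if (st->pending != 0)
    {
      if (unit >= 0xDC00 && unit < 0xE000)
        {
          /* Valid pair: combine both halves into one code point.  */
          out[n++] = 0x10000u
                     + (((uint32_t) st->pending - 0xD800u) << 10)
                     + ((uint32_t) unit - 0xDC00u);
          st->pending = 0;
          return n;
        }
      /* The pending half was not completed: replace it (assumption).  */
      out[n++] = REPLACEMENT_CHAR;
      st->pending = 0;
    }

  if (unit >= 0xD800 && unit < 0xDC00)
    st->pending = unit;            /* First half: wait for the second.  */
  else if (unit >= 0xDC00 && unit < 0xE000)
    out[n++] = REPLACEMENT_CHAR;   /* Lone second half, e.g. \udfff.  */
  else
    out[n++] = unit;               /* Ordinary BMP character.  */
  return n;
}

/* Decode LEN code units into OUT, which must have room for LEN entries.
   Returns the number of code points written.  */
size_t
decode_utf16 (const uint16_t *units, size_t len, uint32_t *out)
{
  struct utf16_state st = { 0 };
  size_t n = 0;
  size_t i;

  for (i = 0; i < len; i++)
    n += feed_unit (&st, units[i], out + n);
  if (st.pending != 0)
    out[n++] = REPLACEMENT_CHAR;   /* Input ended after a first half.  */
  return n;
}
```

With this sketch, the input from the bug report (the single unit 0xDFFF)
decodes to U+FFFD instead of reaching code that expects a valid character.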
