[bug-gettext] broken handling of unicode code point escapes in Tcl


From: Guido Berhoerster
Subject: [bug-gettext] broken handling of unicode code point escapes in Tcl
Date: Mon, 24 Jun 2013 19:24:37 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

xgettext's parsing of Tcl Unicode code point escapes is broken. It
tries to replace each escape with the literal Unicode character, but
it does not consume the last character of the escape and instead
copies it into the output, which results in corrupt .po files, e.g.:

----8<----
$ cat gettext-bug.tcl
#!/usr/bin/tclsh

package require msgcat

puts [msgcat::mc "Hello\u200e\u201cWorld\u201d"]

$ /usr/bin/xgettext -o- gettext-bug.tcl
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <address@hidden>, YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2013-06-24 16:24+0200\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <address@hidden>\n"
"Language-Team: LANGUAGE <address@hidden>\n"
"Language: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

#: gettext-bug.tcl:5
msgid "Hello‎e“cWorld”d"
msgstr ""
---->8----
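
The symptom above is consistent with an off-by-one when resuming the
scan after an escape. The following is only a minimal Python sketch of
that behaviour, not xgettext's actual C code; the function name
decode_tcl_escapes and the buggy flag are hypothetical and exist purely
for illustration. It handles only \u followed by one to four hex
digits and leaves everything else untouched.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def decode_tcl_escapes(s, buggy=False):
    """Replace \\uXXXX escapes (1 to 4 hex digits) with the character.

    With buggy=True the scan resumes ON the last hex digit instead of
    AFTER it, so that digit is copied into the output as well.
    """
    out = []
    i = 0
    while i < len(s):
        if s.startswith("\\u", i):
            # Collect up to four hex digits following the "\u".
            j = i + 2
            while j < len(s) and j < i + 6 and s[j] in HEX_DIGITS:
                j += 1
            if j > i + 2:
                out.append(chr(int(s[i + 2:j], 16)))
                # Assumed bug: the last digit of the escape is not consumed.
                i = j - 1 if buggy else j
                continue
        # Ordinary character (or a "\u" without digits): copy as-is.
        out.append(s[i])
        i += 1
    return "".join(out)


# The string literal from gettext-bug.tcl, kept raw so Python leaves
# the escapes alone.
raw = r"Hello\u200e\u201cWorld\u201d"
print(ascii(decode_tcl_escapes(raw)))
print(ascii(decode_tcl_escapes(raw, buggy=True)))
```

With buggy=True the sketch yields the same msgid as in the transcript:
each escape is decoded correctly, but its final hex digit ("e", "c",
"d") is emitted a second time as a literal character.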

It should probably not try to substitute these escapes at all, since
doing so results in fragile .po files with embedded invisible control
characters; see e.g. the U+200E left-to-right mark in the above
example.
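
Such characters are easy to miss when reading a .po file. As a rough
way to spot them, the Python sketch below lists every character in a
string whose Unicode general category is Cc (control) or Cf (format).
The helper name invisible_chars is hypothetical, and the choice of
categories is an assumption about what counts as "fragile" here.

```python
import unicodedata


def invisible_chars(text):
    """Return (index, code point, name) for control/format characters."""
    found = []
    for i, ch in enumerate(text):
        # Cc = control characters, Cf = format characters such as U+200E.
        if unicodedata.category(ch) in ("Cc", "Cf"):
            name = unicodedata.name(ch, "<unnamed>")
            found.append((i, "U+%04X" % ord(ch), name))
    return found


# The msgid from the transcript, written with explicit escapes.
msgid = "Hello\u200ee\u201ccWorld\u201dd"
for hit in invisible_chars(msgid):
    print(hit)
```

For the msgid above this reports only the left-to-right mark; the
curly quotes are ordinary punctuation and are not flagged.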
--
Guido Berhoerster


