bug-gnu-utils

Gettext bug when translating strings from iso88592-2


From: Tomasz Torcz
Subject: Gettext bug when translating strings from iso88592-2
Date: Sun, 22 Aug 2004 22:37:18 +0200
User-agent: Mutt/1.5.4i

 Hi

 I've encountered a bug in gettext when translating a program whose strings
are written in ISO-8859-2. Here is a test case:

 The program has strings in Latin-2 (ISO-8859-2):

#include <stdio.h>
#include <libintl.h> 
#include <locale.h>

#define _(String) gettext(String)


int main() { 
        char *good = _("good");
        char *bad = _("bąd"); /* contains a character in ISO 8859-2 */
        
        setlocale(LC_ALL, ""); 
        textdomain("example"); 

        printf("%s - %s\n", good, bad);
        
        return 0;
}

 Without a translation it works as expected:
% ./a.out 
good - bąd
%

 Extracting files for translation (xgettext) fails:

% xgettext -d example -o example.po -k_ ex.c 
xgettext: Non-ASCII string at ex.c:10.
          Please specify the source encoding through --from-code.

 The error message is correct: there are non-ASCII letters in "bąd".
So let's take the advice and specify the encoding:

% xgettext -d example -o example.po -k_ --from-code=iso-8859-2 ex.c
%

 Seems OK, BUT the resulting file example.po contains UTF-8 strings. Let's
translate them nevertheless:

#: ex.c:9
msgid "good"
msgstr "PASS"

#: ex.c:10
msgid "bÄ~Ed"
msgstr "FAIL"


% msgfmt example.po -o example.mo
%

 Then we copy the resulting file to .../ex/LC_MESSAGES/example.mo and try
our new translation:

% LC_ALL=ex LANG=ex ./a.out
PASS - bąd
%

 Oops. Only one string got translated. What happened? It seems that
gettext got fooled by Unicode. The original string "bąd" got mangled to
"bÄ~Ed" in the .po file. gettext treats "bąd" and "bÄ~Ed" as two
different strings, so no translation is made.
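
 To make the mismatch concrete, here is a minimal sketch of the two byte
sequences involved. It assumes the lookup compares msgids byte for byte;
same_msgid() is a hypothetical helper written for this illustration, not
part of libintl. The byte values are the standard encodings of "ą": 0xB1 in
ISO-8859-2 and 0xC4 0x85 in UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* "bąd" as compiled from the ISO-8859-2 source: ą is the single byte 0xB1.
   The literal is split so that 'd' is not parsed as part of the hex escape. */
static const char msgid_latin2[] = "b\xb1" "d";

/* "bąd" as xgettext writes it to example.po in UTF-8: ą (U+0105) becomes
   the two bytes 0xC4 0x85. */
static const char msgid_utf8[] = "b\xc4\x85" "d";

/* Hypothetical helper: nonzero when two msgids are identical byte for byte. */
static int same_msgid(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}
```

 Calling same_msgid(msgid_latin2, msgid_utf8) returns 0: the string in the
binary is 3 bytes long, the one in the catalog is 4, so a byte-wise lookup
cannot match them.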

 The situation would be cured if example.po contained:

#: ex.c:10
msgid "bąd"
msgstr "FAIL"

 But getting that output from xgettext is impossible: it insists on
--from-code, which in turn mangles the strings.

 This bug is really a showstopper for the translation of our project.
Versions used:

% xgettext --version
xgettext (GNU gettext-tools) 0.14.1

And glibc-2.3.2.

 (I'm not subscribed, please keep me on the CC list)

-- 
Tomasz Torcz                Only gods can safely risk perfection,     
address@hidden     it's a dangerous thing for a man.  -- Alia
