Re: Feature proposal: string extracting by RegExp for xgettext

From: Bruno Haible
Subject: Re: Feature proposal: string extracting by RegExp for xgettext
Date: Thu, 13 Mar 2008 14:00:00 +0100
User-agent: KMail/1.5.4

Aurélio A. Heckert wrote:
> I don't want to see the messages. I want to help xgettext
> extract messages from new kinds of code.
> Here is what I mean:
> =============================
> #!/usr/bin/mylang
> do something
> print gettext #my text#
> end
> =============================
> xgettext can't extract the "my text" string from this
> strange language, but the problem is not limited to
> new languages: there are a lot of languages not
> supported by xgettext, and more if we think of
> templates... XML-based formats can have localizable
> attributes.
> So... how will xgettext find the gettext function
> in new kinds of code? How should it get the string?
> We could give xgettext a regexp so that it recognizes
> where to get the strings in the code.

You are talking about two topics here:

1) About the new languages: You think that you can describe languages
through regular expressions. I don't think so. Regular expressions are
a good means to do some text processing with very short development time.
But when applied to text written in a programming language, they fail.
(Take the syntax colouring of 'vim' for example. It is described by regular
expressions. It's right 95% of the time, and produces wrong results 5%
of the time.)

Also, a single regular expression will not be enough. What you would need
is some kind of programmable execution engine (possibly a state machine)
where regular expressions are only one ingredient.

If you are inventing a particular new language, and are only interested in
quick-and-dirty results, you can program your own extractor in a scripting
language like Python. In Python you already have a binding to the libgettextpo
library for creating the .pot file, so you can concentrate on your parsing.
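To make the quick-and-dirty route concrete, here is a minimal sketch for the
hypothetical "gettext #my text#" syntax from your example. The function names
(extract, po_quote, to_pot) are made up for illustration, and the .pot entries
are written by hand as plain text rather than through a libgettextpo binding.
A real .pot file would also need a header entry, which is omitted here.

```python
import re

# Assumed syntax, taken from the example above: gettext #some text#
CALL = re.compile(r"\bgettext\s+#([^#\n]*)#")


def extract(source, filename="input"):
    """Return {msgid: [(filename, lineno), ...]} in order of first appearance."""
    found = {}
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in CALL.finditer(line):
            found.setdefault(match.group(1), []).append((filename, lineno))
    return found


def po_quote(text):
    """Quote a string for a PO file (backslashes and double quotes only)."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_pot(found):
    """Format the extracted strings as PO entries with empty translations."""
    out = []
    for msgid, refs in found.items():
        out.append("#: " + " ".join(f"{name}:{line}" for name, line in refs))
        out.append("msgid " + po_quote(msgid))
        out.append('msgstr ""')
        out.append("")
    return "\n".join(out)


if __name__ == "__main__":
    sample = "#!/usr/bin/mylang\ndo something\nprint gettext #my text#\nend\n"
    print(to_pot(extract(sample, "demo.mylang")))
```

Note that this has exactly the weakness described under 1): if the language
had comments or other string syntaxes, a "gettext #...#" inside them would
be extracted as well.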

2) About XML-based formats: xgettext supports the GNOME Glade format. Its
designers soon noticed that a long hardcoded list of localizable tags was
not a good idea. In Glade 2, therefore, there is an attribute
translatable="yes" which makes it easier to extract the localizable contents.
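
To illustrate why a marker attribute helps, here is a small sketch in Python
using the standard xml.etree.ElementTree module. It assumes the attribute in
question is translatable="yes", as found in Glade 2 files; the function name
and the sample document are invented for the example, and this is not how
xgettext itself is implemented.

```python
import xml.etree.ElementTree as ET

# Invented sample in the style of a Glade 2 file.
SAMPLE = """\
<glade-interface>
  <widget class="GtkButton" id="ok_button">
    <property name="label" translatable="yes">Save file</property>
    <property name="visible">True</property>
  </widget>
</glade-interface>
"""


def translatable_strings(xml_text):
    """Collect the text of every element marked translatable="yes"."""
    root = ET.fromstring(xml_text)
    return [
        element.text
        for element in root.iter()
        if element.get("translatable") == "yes" and element.text
    ]


if __name__ == "__main__":
    for msgid in translatable_strings(SAMPLE):
        print(msgid)
```

The extractor needs no list of tag names at all: whatever the file format
marks as translatable is picked up.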

If you have an XML format of your own and want to produce a PO file from it,
the ideal scripting language for this task is probably XQuery. Less ideal,
but also possible, is XSLT that produces XLIFF, followed by an XLIFF to PO
converter [1].


[1] http://xliff-tools.freedesktop.org/wiki/
