[GNUCTT] Fwd: GNUnited Nations effective as of mid Jun

From: 狗狗
Subject: [GNUCTT] Fwd: GNUnited Nations effective as of mid Jun
Date: Sat, 7 Jun 2008 21:43:18 +0800

---------- Forwarded message ----------
From: Yavor Doganov <address@hidden>
Date: Sun, 01 Jun 2008 23:55:22 +0300
Subject: GNUnited Nations effective as of mid Jun
To: address@hidden
Cc: Peter Brown <address@hidden>, address@hidden, Brett Smith
<address@hidden>, Richard Stallman <address@hidden>, Sylvain Beucler

\/\/ drumrolls \/\/

[ Webmasters: Probably only "Information for webmasters" would be of
  interest to you.  I am also CC-ing rms, brett, peterb and beuc just
  for their info.]

After 10 months of development (with the usual interruptions, delays,
failures and disappointments), I am pleased to announce that I plan to
upload the GNUN code on June 13th and set up a daily cronjob shortly
after that.

This is the final moment to raise objections.  The concept and the
code were reviewed in March by Karl Berry, Matt Lee, John Sullivan,
Joshua Gay and D. E. Evans.  Richard Stallman and Peter Brown have
been aware of this effort for quite some time, although they know
nothing about the (technical) details.  The current code can be
checked out with this command:

cvs -d:pserver:address@hidden:/sources/trans-coord \
  co trans-coord/gnun

This is exactly what the `www' repository will look like, with GNUN's
own files under /server/gnun and each (sub-)directory having its own
/po sub-directory.

The full manual, describing the revamped translation process including
how to start/join/manage a team, etc. will be available at
http://gnu.org/software/trans-coord/manual within a week or so.  I
will delete all scattered and outdated files at
/server/standards/translations and install a symlink to the new united

What is GNUN and what problems does it solve?

GNUN is a suite to assist in maintaining translations of gnu.org
articles.  The nature and purpose of GNUnited Nations is best
explained in several nodes in gnun.texi; the text below is only a
short and deliberately simplistic review.

GNUN addresses the problem that the world is a morass, and that
typically people prefer drowning in it to trying to reach firm
ground and turn the morass into a pleasant grassland.  Maintaining a
part of the world, namely the gnu.org translations, is not mission
impossible, but mission rather frustrating.  When more languages come
into the picture, it may even become close to impossible (for some
morass-diving coordinators like yours truly it is already a
substantial burden).  GNUN's task is to clean up the mess at least
partially, and lay the foundation for other achievements.

If you translate a free program and then the developer changes a
string, your outdated translation is discarded at runtime (i.e. the
original English string is displayed), and this is a major feature --
if localized programs contained translations that do not match the
original, they would be unusable in these locales, in most cases.

Similarly, GNUN applies this approach to gnu.org articles.  If the
original English article changes, a part of the translated one(s) will
be replaced with the original text.  Translators will then update
those articles that must be updated -- it is easy to find them, and
only a keystroke in your favorite PO editor will take you to the part
that has to be updated.
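
The fallback described above can be sketched in a few lines of Python.
This is only an illustration of the idea, not GNUN's actual code (GNUN
itself is built on GNU Make, Guile and Po4a).  The function name
`rebuild_article' and the sample paragraphs are made up for the
example, and "fuzzy" entries are ignored here.

```python
# Toy model: a translation is a mapping from each original English
# paragraph (the msgid) to its translated paragraph (the msgstr).
def rebuild_article(original_paragraphs, translations):
    """Return (output, pending).

    output  -- paragraphs of the regenerated translated article
    pending -- English paragraphs that still need a translation
    """
    output, pending = [], []
    for paragraph in original_paragraphs:
        translated = translations.get(paragraph, "")
        if translated:
            output.append(translated)
        else:
            # No matching translation: fall back to the English text.
            output.append(paragraph)
            pending.append(paragraph)
    return output, pending


# Hypothetical sample data: the second paragraph was changed upstream,
# so the old translation no longer matches it.
translations = {
    "Free software is a matter of liberty, not price.":
        "El software libre es una cuestión de libertad, no de precio.",
    "We maintain this definition.":
        "Mantenemos esta definición.",
}
original = [
    "Free software is a matter of liberty, not price.",
    "We maintain this free software definition.",
]

output, pending = rebuild_article(original, translations)
print(output)   # first paragraph translated, second one in English
print(pending)  # what the translator still has to update
```

The `pending' list corresponds to the entries a translator would step
through in a PO editor.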

To illustrate why this is important and desirable for gnu.org
articles, please open http://gnu.org/philosophy/free-sw.html and visit
all of its translations.  You will observe that although this is an
article that doesn't change often, many of the translations are
outdated, and even those that are in sync (here comes the question how
one can determine this!) create a "rusty" feeling among readers as
they are not updated to the current gnu.org style.

GNUN's power is most visible for articles that are dynamic in nature,
such as the homepage, the main Philosophy page, the GPL FAQ, the
GNU/Linux FAQ, license-list and others.

GNUN does not translate essays automatically -- the core part of this
job is still as hard as it is, and it'll be that way even when an AI
is developed.  GNUnited Nations is merely a simple infrastructure
enabling you to check what needs to be done, and to get it done
quickly and efficiently.  It is also a tool that will prevent rot
from accumulating when you take, say, a 3- or 6-month break.  It is
easy to bring the translations up-to-date afterwards, and meanwhile
they will not contain incorrect information.  For example, when
Windows is released under the GPL and all the original articles are
updated to reflect that fact -- all translations under GNUN's control
will be rebuilt automatically.  Bugs (still not fixed, BTW) like the
outdated translations wrt Sun's Java will become history.

This will also automatically solve a major problem we have now: how to
update or fix such bugs in translations that have no established team
or whose team is currently orphaned.

Site-wide changes to the templates, like "urgent" notices, propagate
to ALL translations managed by GNUN -- IOW, once all (or almost all)
translations are migrated under GNUN's control, gnu.org will behave as
one entity; touching the right original files will trigger a rebuild
of the files dependent on them, and can change the whole look/style in
a breeze, for all languages involved.  This is possible because of the
second most adorable and amazing software ever developed, GNU Make.
(The first place has always been taken by GNU Emacs -- just in case
you didn't know this.  No smiley here, I'm completely serious.)
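
The rebuild rule that GNU Make applies (remake a target when it is
missing or older than one of its prerequisites) can be sketched as
follows.  This is a simplified model, not GNUN's makefile: the file
names and the integer timestamps are hypothetical and chosen only to
show how touching one template makes every dependent translation
stale.

```python
def needs_rebuild(target_mtime, prerequisite_mtimes):
    """Make-style check: a target is stale if it does not exist
    (mtime is None) or if any prerequisite is newer than it."""
    if target_mtime is None:
        return True
    return any(m > target_mtime for m in prerequisite_mtimes)


# Hypothetical modification times (bigger number = more recent).
mtimes = {
    "template.html": 100,      # just touched by a webmaster
    "article.html": 90,
    "article.bg.html": 95,
    "article.es.html": 95,
}

# Hypothetical dependency graph: every translation depends on the
# English original and on the shared template.
deps = {
    "article.bg.html": ["article.html", "template.html"],
    "article.es.html": ["article.html", "template.html"],
}

stale = sorted(
    target
    for target, prerequisites in deps.items()
    if needs_rebuild(mtimes.get(target), [mtimes[p] for p in prerequisites])
)
print(stale)
```

With these numbers both translations are reported as stale, because
the template is newer than either of them.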

The main feature of GNUN is to keep translations in sync with the
original.  Translators' convenience is only a secondary goal, but that
goal is achieved more or less in a natural way as it is inherently
related to the main one.

A PO file provides a natural way to proof-read a translation as the
msgid (the original) appears right above the msgstr (the translation).
For relatively short paragraphs, one can easily check the translation
of each item in Emacs (or any other PO editor, FWIW) without using
extra slightly inconvenient techniques like Follow mode or switching
buffers 50 times during the review.  Just for the record, I have not
looked at any Bulgarian HTML markup since January and I feel very,
very well as a coordinator of my team.  I tend to think about our HTML
files as `configure' scripts generated by `configure.ac' (i.e. the PO

GNUN provides an extra makefile that would enable teams to use their
own Savannah repository for draft translations.  Merges/updates from
"www" and reports are done automatically provided that you enable a
cronjob for that.  The goal behind this feature is to improve the team
spirit and the importance of peer reviews, and also to "download" some
of the burden off the shoulders of the team leaders.  A team leader
does and must do the final reviews, but it seems that team leaders
currently do ALL reviews -- this leads to excessive stress in some
teams with huge membership, leading to malfunction and frustration.
Proof-reading of translations ideally should be a collective effort,
and GNUN makes it easier to achieve that goal.

In essence, GNUnited Nations is currently a (simple) build system
whose purpose is to keep translations always up-to-date.  We will
extend it further in order to match its pompous name and translators'

The policy for GNUN development is to use only GNU packages, if
possible (the only non-GNU package is Po4a, which does an essential
part of the job), and to follow the GNU Coding Standards.  This
decision is not only because of loyalty (although that reason alone is
completely valid and more than enough), but also because it is the
most natural thing on Earth for the translations of gnu.org essays to
be handled by GNU
(Hopefully this will put an end to objections like "Why is GNUN
written in GNU Make and Guile and not in SCons/Waf/whatever and
Lua/Haskell/whatever" and "Why is the documentation in Texinfo and not
Halibut/Yodl/whatever".  This is also the reason why all examples in
the manual are for GNU Emacs and gettext, and not other editors and

Information for webmasters

Nothing will change radically in your workflow, and care has been
taken that old-timers with well established habits and tradition in
maintaining gnu.org pages do not have to adapt to this new system.  So
generally, you'll do what you have been doing so far.

There are some consequences of your actions and things to be aware of:

* If you modify an article, you implicitly modify its translations,
  i.e. the changes you make will appear verbatim in all translated
  articles which are under GNUN's control within 24 hours.  This is
  much like when you change a string in a program and then run `make

* All characters in an English article must be ASCII + HTML entities.
  This is because the English text gets incorporated verbatim in the
  translation when a string is "new" or "fuzzy" and in a non-UTF-8
  translation you get broken rendering when part of it is English +
  UTF-8-encoded characters.

  This has been the practice so far (more or less), so I guess it
  won't disturb you much.  I'll take care to fix any problems that

* Nags and nitpicking.  Since the original English HTML articles are
  the canonical source for various manipulations and mechanical
  transformations, we do the usual sanity check of validating them
  before doing anything.  Historically, invalid articles at gnu.org
  have always been considered a bug.  Now you'll get a warning mail
  when an article is not valid due to some change someone has made;
  the build fails in this case.  The email message will contain
  information about where exactly the error occurs and what it entails.
  Most probably I'll fix all of these errors before the next GNUN
  build, so you don't have to worry too much.

  For those who generally care about validation, there is a simple
  script `validate-html' which you can use to check the validity of an
  article before committing (usually after making extensive changes,
  when you doubt whether the result will validate).  It is gnu.org-aware,
  so running e.g. `validate-html licenses/license-list.html' in your
  working copy will expand all directives (thanks to the most romantic
  macro processor GNU M4) and will print nothing if valid, and errors
  with details on stdout if invalid.

  In addition, a different kind of error can occur even when the
  article is valid XHTML.  GNUN relies on some regexps based on
  boilerplate.html to strip out common parts that don't have to be
  translated but must be re-added automatically for all languages.  So
  if you deviate too much from that to the extent that the
  `make-prototype' script cannot build a skeleton for the POT from the
  XHTML source, you'll get a Guile backtrace with full details of the

  I don't expect such things to happen often, and these errors will be
  mailed to www-discuss just in case I'm not available -- someone else
  could fix the bug then.

* Basically all of GNUN's rules rely on boilerplate.html and the
  current SSI-based structure/layout.  Any changes/reorganization of
  the templates must be tested first and GNUN's recipes/scripts
  adjusted accordingly.  Otherwise things will break.  Adding a new
  menu or modifying a text is generally safe.  If you plan to change
  the style or make any significant change, please inform
  <address@hidden> first, preferably with a diff of the
  changes.  Failing to do so will unleash the powerful wrath of a wild
  and primitive Balkanian.
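
The "ASCII + HTML entities" rule from the list above is easy to check
mechanically.  The following Python sketch is not part of GNUN and is
not the `validate-html' script; the function name `non_ascii_report'
is made up for this example.  It only reports where non-ASCII
characters occur and which numeric character reference could replace
them.

```python
def non_ascii_report(text):
    """Return a list of (line, column, character, entity) tuples,
    one for every non-ASCII character found in text."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for column, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                # Numeric character reference, e.g. U+00A9 -> &#169;
                findings.append((lineno, column, ch, "&#%d;" % ord(ch)))
    return findings


# Hypothetical fragment of an English article with a raw copyright sign.
sample = "<p>Copyright \u00a9 2008</p>\n<p>Plain ASCII line.</p>"

for lineno, column, ch, entity in non_ascii_report(sample):
    print("line %d, column %d: replace %r with %s"
          % (lineno, column, ch, entity))
```

An empty result means the text is safe to be copied verbatim into a
translation that uses a non-UTF-8 encoding.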

Transition to GNUN

The conversion of the existing HTML translations to PO format is not a
hard process, but it is undoubtedly a bit tedious.  The good news is
that 1) it must be done only once per article; 2) you can almost
automatically convert a translation in HTML 2.0 or 4.0 to the current
gnu.org-adopted standard (XHTML 1.0) with the server templates.  So
all in all, the job is smaller than even the conversion of an
XHTML-based translation to the new SSI-based layout.  The crucial part
in this conversion process is to make sure that your translation is a
100% match to the original -- further changes will be tracked
automatically, but your initial translation must be correct.

GNUN will be compulsory for all teams in order to make things
manageable and bearable for the web-translators staff.  I believe
there is a huge benefit for the translators as well.  The current
system has been tested with Afrikaans, Bulgarian, Catalan, Spanish,
Portuguese, Brazilian Portuguese, Russian and Simplified Chinese
translations.  I wish we had a test case for Indic and RTL
languages, but such is life.

Here is the deployment/migration plan:

15 Feb 2008     All translations of prospective team leaders are
                submitted as PO files.

22 Mar 2008     Official testing period in trans-coord's repository.

13 Jun 2008     GNUN instance operating at www.

15 Sep 2008     All new translations MUST be installed as PO files.

31 Dec 2008     Converted articles are close to 50%.

30 Jun 2009     More than 75% of all translations are under GNUN's
                control.
30 Jun 2010     100% coverage.

The following table shows how much work has to be done per team.

|LANG |Mundane name  |Articles| PO  |
|     |              |        |files|
| af  |Afrikaans     |  6     |  5  |
| ar  |Arabic        |  9     |     |
| az  |Azerbaijani   |  4     |     |
| bg  |Bulgarian     | 25     | 25  |
| bn  |Bengali       |  3     |     |
| bs  |Bosnian       | 13     |     |
| ca  |Catalan       |103     | 10  |
| cs  |Czech         | 56     |     |
| da  |Danish        |  6     |     |
| de  |German        | 54     |     |
| el  |Greek         | 30     |     |
| eo  |Esperanto     |  1     |     |
| es  |Spanish       |138     |  6  |
| fa  |Farsi         | 11     |     |
| fi  |Finnish       |  6     |     |
| fr  |French        |233     |     |
| gl  |Galician      |  3     |     |
| he  |Hebrew        | 21     |     |
| hr  |Croatian      |  9     |     |
| hu  |Hungarian     | 10     |     |
| id  |Indonesian    | 40     |     |
| it  |Italian       | 80     |     |
| ja  |Japanese      | 61     |     |
| kn  |Kannada       |  6     |     |
| ko  |Korean        | 75     |     |
| mk  |Macedonian    |  7     |     |
| nl  |Dutch         | 84     |     |
| nb  |Norwegian     | 13     |     |
|     |Bokmål        |        |     |
| nn  |Norwegian     |  8     |     |
|     |Nynorsk       |        |     |
| pl  |Polish        |132     |     |
| pt  |Portuguese    |  5     |  4  |
|pt-br|Brazilian     | 74     |  4  |
|     |Portuguese    |        |     |
| ro  |Romanian      | 54     |     |
| ru  |Russian       | 55     |  8  |
| sk  |Slovak        |  1     |     |
| sl  |Slovenian     |  4     |     |
| sq  |Albanian      | 10     |     |
| sr  |Serbian       | 31     |     |
| sv  |Swedish       | 11     |     |
| ta  |Tamil         | 20     |     |
| th  |Thai          |  1     |     |
| tl  |Tagalog       |  3     |     |
| tr  |Turkish       | 10     |  4  |
| uk  |Ukrainian     |  5     |     |
| uz  |Uzbek         |  5     |     |
| vi  |Vietnamese    |  1     |     |
|zh-cn|Chinese       | 58     |  5  |
|     |Simplified    |        |     |
|zh-tw|Chinese       | 55     |     |
|     |Traditional   |        |     |

Total translations:  1650       73      (~4% migrated)
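
The migration figures can be recomputed with a small Python sketch.
It uses three rows copied from the table and the stated totals; the
function name `migrated_share' is made up for this example.

```python
def migrated_share(po_files, articles):
    """Percentage of a team's articles already converted to PO files."""
    return 100.0 * po_files / articles


# (articles, PO files) for a few teams, copied from the table above.
teams = {"af": (6, 5), "bg": (25, 25), "ca": (103, 10)}

for lang, (articles, po_files) in sorted(teams.items()):
    print("%-5s %5.1f%%" % (lang, migrated_share(po_files, articles)))

# Overall figure from the stated totals (73 PO files, 1650 articles).
overall = migrated_share(73, 1650)
print("all   %5.1f%%" % overall)
```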

Pink elephants

The plan is to develop a web-based statistics facility much like

For those who can't use a computer without a clicking device
(a.k.a. mouse) and a web browser, we will develop a web frontend to
GNUN, where one can directly edit and submit a translation (or part of
it).  Hopefully this would help (or not...) new volunteers who are
having trouble working on the GNU/Linux console.

As "web-based" and "convenience" are two totally incompatible and
contradicting things for me, I expect people who care about this to
help with the development, or at least describe in detail what they
expect from such an interface.

RMS promised a dedicated machine for this, if it proves necessary.

Texinfo conversion

This is currently just an idea, and bringing it into existence will
surely be hard due to some historic limitations in Texinfo and TeX.
Also, it would require more discipline from translators when editing
the PO files -- much of what is now recommended will become mandatory.

Once we ensure that translations are automatically kept in sync and
their HTML forms follow exactly the English markup, it is possible (at
least in theory) to write new rules for conversion to Texinfo, and
generation of selected essays in Info and PDF (plus PostScript, etc.)
formats.  The printed essays would be invaluable for speeches and
various free software events.  Even for languages not supported by
Computer Modern, having the essays info-browsable both in English and
translated on your GNU system would be an excellent thing.  Some
entirely free distros may even package them and install the packages
by default -- that would be awesome.


If you have any questions, suggestions or discover a bug, feel free to
send them to <address@hidden> or file a bug at the Savannah
project `trans-coord', category `-- GNUnited Nations --'.  Thanks in

Of course, if clarifications are necessary, you can just reply to this

Feel free to bounce/forward this message to any member of your
translation team.
