
Re: Emacs i18n

From: Paul Eggert
Subject: Re: Emacs i18n
Date: Sun, 10 Mar 2019 20:52:47 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.5.1

Richard Stallman wrote:
Compare this

   (numeric-case NUMBER
       (russian-masc "%d байт скопирован, %s, %s")
       (russian-fem "%d байта скопировано, %s, %s")
       (russian-neut "%d байт скопировано, %s, %s"))

with this:

     "Plural-Forms: nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 &&
     n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);\n"
     #: src/dd.c:822
     #, c-format
     msgid "%<PRIuMAX> byte copied, %s, %s"
     msgid_plural "%<PRIuMAX> bytes copied, %s, %s"
     msgstr[0] "%<PRIuMAX> байт скопирован, %s, %s"
     msgstr[1] "%<PRIuMAX> байта скопировано, %s, %s"
     msgstr[2] "%<PRIuMAX> байт скопировано, %s, %s"

I'm afraid that's not an apples-to-apples comparison. The first form contains only the Russian translations, whereas the second form contains much more information: the source-code location of the untranslated strings, a copy of the untranslated English-language strings, and the general plural rules for Russian (the last is shared among all the Russian translations, not just the translations listed here). This extra information is useful for translators, and the format is supported by a reasonably extensive existing software suite, not to mention translators who are already used to it.
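
To make the shared Plural-Forms rule concrete, here is a minimal Python sketch of how such a rule selects among the three msgstr entries. It assumes the usual Russian rule, whose middle clause is n%10<=4; the names russian_plural_index, FORMS and bytes_copied are illustrative only and are not part of gettext or Emacs, and the strings are shortened copies of the ones quoted above.

```python
def russian_plural_index(n: int) -> int:
    """Return the msgstr index chosen by the assumed Russian plural rule.

    Mirrors: n%10==1 && n%100!=11 ? 0
             : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2
    """
    if n % 10 == 1 and n % 100 != 11:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1
    return 2


# Shortened versions of msgstr[0..2] from the catalog entry quoted above.
FORMS = [
    "{n} байт скопирован",
    "{n} байта скопировано",
    "{n} байт скопировано",
]


def bytes_copied(n: int) -> str:
    """Pick the translated string for n and substitute the number."""
    return FORMS[russian_plural_index(n)].format(n=n)


if __name__ == "__main__":
    for n in (1, 2, 5, 11, 21, 22, 112):
        print(russian_plural_index(n), bytes_copied(n))
```

The point of the sketch is that the rule is written once per language and only picks an index; which words appear at each index is left entirely to the translator.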

I can envision something like this:

       "russian-nom:%d байт%| скопирован%|, %s, %s"

where the 'russian-nom' operator would replace the two %| sequences
with the appropriate declensional suffixes for the nominative case.

But Russian declension is not that simple. The Russian word for "byte" is "байт", but its plural form depends not only on the number (as in the above examples) but also on its case: the "байт" and "байта" in the above examples are not exhaustive. And some words have irregular declensions: for example, ребёнок (singular) versus де́ти (plural) for the same noun. And it's not just nouns and pronouns that are affected: adjectives also have singular and plural forms. And I have by no means exhausted the issues involved here; to get a better feeling for the complexity in this area, please see:


Although it wouldn't be impossible for Emacs Lisp code to handle all the special cases for Russian declension, it would be tricky to implement, and tricky to document in a way that translators would easily understand. And we'd also have to implement and document similarly tricky rules for other languages. And we'd have to deal with the fact that not every Russian speaker agrees on how to decline words like "байт" that are imported from English. These sorts of issues should be delegated to translators, not to likely-fragile code in Emacs Lisp (a technology that translators typically do not grok).

In contrast, the gettext way is relatively simple and easily understood, and is already common practice.
