Re: Emacs i18n

From: Richard Stallman
Subject: Re: Emacs i18n
Date: Fri, 08 Mar 2019 22:11:51 -0500

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > Russian, for example, uses three different grammatical cases, which are
  > dependent on the last digit of the number, the system needs to be more
  > complicated.

Here's an idea for a scheme general enough to handle Russian as well.
I propose something like a case or select construct.
First, the elegant Lispy way to represent it:

  (numeric-case NUMBER
      (1 "Just one frob")
      (2 "Two frobs")
      (russian-masc "%d-m frobs")
      (russian-fem "%d-f frobs")
      (russian-neut "%d-n frobs")
      (t "%d frobs"))

Translation would have to replace the entire numeric-case construct
with another (translated) numeric-case construct.  Thus, the source
code would contain one suitable for English:

  (numeric-case NUMBER
      (1 "one frob")
      (t "%d frobs"))

and for Russian we would translate it into this one:

  (numeric-case NUMBER
      (russian-masc "%d-m frobs")
      (russian-fem "%d-f frobs")
      (russian-neut "%d-n frobs"))

I think this framework could be extended to handle
whatever other weird grammatical rules we might encounter in other languages
in the future.
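
To make the Lisp form concrete, here is a minimal sketch of how such a
construct might be written as a macro.  Everything in it is an
assumption for illustration: the names my-numeric-case,
my-numeric-case--match-p and my-last-digit-one-p are hypothetical, and
the predicate is only a stand-in that looks at the last digit, not a
real rule for Russian.  A clause condition is taken to be an integer,
t, or a symbol naming a one-argument predicate.

```elisp
(defun my-last-digit-one-p (n)
  "Placeholder predicate: non-nil if the last decimal digit of N is 1."
  (= (% (abs n) 10) 1))

(defun my-numeric-case--match-p (condition number)
  "Return non-nil if NUMBER satisfies CONDITION.
CONDITION is t (always matches), an integer (matches when equal
to NUMBER), or a symbol naming a one-argument predicate."
  (cond ((eq condition t) t)
        ((integerp condition) (= condition number))
        ((and (symbolp condition) (fboundp condition))
         (funcall condition number))
        (t (error "Unknown numeric-case condition: %S" condition))))

(defmacro my-numeric-case (number &rest clauses)
  "Format the string of the first clause in CLAUSES that matches NUMBER.
Each clause has the form (CONDITION FORMAT-STRING).  NUMBER is
evaluated once and passed to `format' as the only argument.
Return nil if no clause matches."
  (declare (indent 1))
  (let ((n (make-symbol "n")))
    `(let ((,n ,number))
       (cond
        ,@(mapcar (lambda (clause)
                    `((my-numeric-case--match-p ',(car clause) ,n)
                      (format ,(cadr clause) ,n)))
                  clauses)))))

;; Usage:
(my-numeric-case 1
  (1 "one frob")
  (t "%d frobs"))                         ; => "one frob"

(my-numeric-case 21
  (my-last-digit-one-p "%d-m frobs")
  (t "%d frobs"))                         ; => "21-m frobs"
```

The "one frob" clause works without any special format-spec here
because `format' ignores extra arguments; that only covers the case
where the unused argument comes last.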

While doing it with Lisp syntax is elegant, it would require
generalization of the infrastructure for recording translations to
handle more than strings.   That would be a pain.

Here's a way to represent the conditional construct as a kind of
string.  That way, translation would only need to translate strings
into strings.

We could use | in the string to separate alternatives, and : to end
a condition.  It would look like this:

  (numeric-case NUMBER
    "1:one frob|\
     t:%d frobs")

For Russian, we would translate the source string

  1:one frob|t:%d frobs

into

  russian-masc:%d-m frobs|russian-fem:%d-f frobs|russian-neut:%d-n frobs

The subsequences : and | would be handled by the function numeric-case.
They would not affect the meaning of the string data type as such.
numeric-case would ignore whitespace after |.

With this string convention, we only need to translate strings.

To include a | in an alternative, you could write a double |.
We do not need a way to quote a colon.
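
The string convention can be sketched in a few lines of Emacs Lisp.
This is only one possible reading of the rules above, not a settled
design.  The names my-nc-split, my-nc-match-p, my-nc-format and
my-nc-predicates are hypothetical, and so is the "ends-in-one"
condition, which is a stand-in predicate rather than a real Russian
rule.  Conditions are looked up in an explicit table, on the assumption
that a translated string should not be able to call arbitrary
functions.  Each alternative is split at its first colon, which is why
colons in the text need no quoting.

```elisp
(defun my-nc-ends-in-one-p (n)
  "Placeholder predicate: non-nil if the last decimal digit of N is 1."
  (= (% (abs n) 10) 1))

(defvar my-nc-predicates
  (list (cons "ends-in-one" #'my-nc-ends-in-one-p))
  "Alist mapping condition names to predicates (hypothetical registry).")

(defun my-nc-split (spec)
  "Split SPEC at single | characters; a doubled || yields a literal |.
Whitespace directly after a separating | is skipped.
Return the list of alternatives as strings."
  (let ((alternatives '())
        (current "")
        (i 0)
        (len (length spec)))
    (while (< i len)
      (let ((ch (aref spec i)))
        (cond
         ;; "||" is an escaped literal "|".
         ((and (= ch ?\|) (< (1+ i) len) (= (aref spec (1+ i)) ?\|))
          (setq current (concat current "|")
                i (+ i 2)))
         ;; A single "|" ends the current alternative.
         ((= ch ?\|)
          (push current alternatives)
          (setq current ""
                i (1+ i))
          ;; Ignore whitespace after the separator.
          (while (and (< i len) (memq (aref spec i) '(?\s ?\t ?\n)))
            (setq i (1+ i))))
         (t
          (setq current (concat current (string ch))
                i (1+ i))))))
    (push current alternatives)
    (nreverse alternatives)))

(defun my-nc-match-p (condition number)
  "Return non-nil if NUMBER satisfies the string CONDITION."
  (cond ((string= condition "t") t)
        ((string-match-p "\\`[0-9]+\\'" condition)
         (= (string-to-number condition) number))
        (t (let ((pred (cdr (assoc condition my-nc-predicates))))
             (and pred (funcall pred number))))))

(defun my-nc-format (number spec)
  "Return the formatted text of the first alternative in SPEC matching NUMBER.
Return nil if no alternative matches."
  (catch 'found
    (dolist (alt (my-nc-split spec))
      ;; Split at the first colon only; later colons belong to the text.
      (when (string-match ":" alt)
        (let ((condition (substring alt 0 (match-beginning 0)))
              (text (substring alt (match-end 0))))
          (when (my-nc-match-p condition number)
            (throw 'found (format text number))))))
    nil))

;; Usage:
(my-nc-format 1 "1:one frob|\
     t:%d frobs")                                    ; => "one frob"
(my-nc-format 21 "ends-in-one:%d-m frobs|t:%d frobs") ; => "21-m frobs"
(my-nc-split "a||b|c")                               ; => ("a|b" "c")
```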

Perhaps one could develop a smarter 'russian' alternative
that knows how to change the last letter automatically and handles
all three alternatives.

Maybe we need to define a format-spec for devouring and ignoring one argument.

Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)
