emacs-devel

Re: Translation of manuals (was: SES manual French translation)


From: Jean-Christophe Helary
Subject: Re: Translation of manuals (was: SES manual French translation)
Date: Tue, 02 Jan 2024 13:16:53 +0000

>>> If and when you decide on doing that, I'm guessing that we will need
>>> to keep the original "source" files for the translations in the Git
>>> repository, and find a suitable way of producing *.texi from them.
>>> I'm not sure po4a is the best alternative; we should probably consider
>>> others as well and find what is best for us.
>> 
>> Do you have other alternatives in mind?
> 
> Not at the moment, no.  And I'm not even sure this is our
> Emacs-specific problem to solve.  If the GNU Project is going to
> support translations of the manuals, the ways of doing that should be
> discussed project-wide, probably on the Texinfo mailing list or at
> least with the participation of the Texinfo developers.

The GNU project is already supporting translation of documentation.

https://translationproject.org/extra/matrix.html

(Texinfo is there too)

>> If we were not to use an intermediate format and directly use texi,
>> we’d need a translation reference format, like TMX, but it could be
>> translated PO files, since they share a similar structure. The
>> translators would translate chunk by chunk and need a tool that
>> identifies already translated chunks from the reference format and use
>> them instead.
> 
> We could for starters invent some simple technique of our own, like
> markers within the Texinfo sources that identify small enough chunks
> of text, or something similar.

Markers are already here (Texinfo commands, end-of-sentence markers,
etc.).

Some commands in Texinfo have translatable contents, others do not.
 
There are segmentation rule sets shared across the “industry” that
define splitting rules and exception rules based on the source
language (for example, “in English, split on a period followed by a
space”, with exceptions such as “not after M. ”). The rules are based
on regular expressions.

A rule is basically a pair of “before the split” / “after the split”
regexes. Exceptions have the same structure. Exceptions are applied
first and “lock” a potential split point. Then the splitting rules are
applied to the remaining positions. You end up with a list of
“segments”.
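
To make that concrete, here is a minimal sketch in Python. The rule
lists and function names are made up for illustration; real rule sets
are much larger and this is not how any particular tool implements
them. It only shows the "exceptions lock, then breaks split" order.

```python
import re

# Hypothetical rule sets, for illustration only.
# Each rule is a (before, after) pair of regexes around a candidate split point.
EXCEPTIONS = [
    (r"\bM\.", r"\s"),   # "M. " is an abbreviation, so never split after it
]
BREAKS = [
    (r"[.?!]", r"\s"),   # split after . ? or ! when whitespace follows
]

def _matches(rules, text, pos):
    """True if any rule matches around position pos (text[:pos] | text[pos:])."""
    before, after = text[:pos], text[pos:]
    return any(
        re.search(b + r"\Z", before) and re.match(a, after)
        for b, a in rules
    )

def segment(text):
    """Split text into segments; exceptions are checked before break rules."""
    segments, start = [], 0
    for pos in range(1, len(text)):
        if _matches(EXCEPTIONS, text, pos):
            continue  # an exception "locks" this position
        if _matches(BREAKS, text, pos):
            segments.append(text[start:pos].strip())
            start = pos
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments

print(segment("M. Dupont uses Emacs. He likes it! Does it work?"))
```

With the rules above, the period after "M" is locked by the exception,
so the first segment keeps "M. Dupont uses Emacs." in one piece.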


> And again: the rate of change of the misc manuals is quite low.  If
> and when someone will decide to translate the Emacs user manual or the
> ELisp reference manual, then we'll have a serious problem with keeping
> up with the rate of changes.  But not before that.

I’m sorry I did not make myself clear. *I* for one am working on it. 
There are about 2 million words in the various manuals and I’ve 
translated (or recycled) about 130,000. I’m focusing on the Emacs 
manual at the moment, and I sometimes do a few paragraphs in the Lisp 
reference or the introduction. I have barely touched the miscellaneous 
manuals.

The process I’m using is based on converting texi files to PO and then
translating the PO files with appropriate “computer aided translation”
software.
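
As a rough illustration of the idea (not of what any converter
actually emits, which also includes headers and comments), source
segments can be written out as PO entries, with segments that are
already in a translation memory pre-filled. The function names and
the `memory` dictionary are assumptions made for this sketch.

```python
def po_escape(s):
    """Escape backslashes, double quotes and newlines for a PO string."""
    return s.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

def to_po(segments, memory):
    """Return PO entries for segments, reusing translations found in memory.

    memory is a hypothetical dict mapping source segments to translations;
    segments not found in it get an empty msgstr (untranslated).
    """
    entries = []
    for seg in segments:
        translation = memory.get(seg, "")
        entries.append(
            f'msgid "{po_escape(seg)}"\nmsgstr "{po_escape(translation)}"\n'
        )
    return "\n".join(entries)

memory = {"Save the file.": "Enregistrez le fichier."}
print(to_po(["Save the file.", "Then quit Emacs."], memory))
```

The second entry comes out with an empty msgstr, which is what a
translator would then fill in.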

I’m totally fine with the Emacs project taking a completely different
approach, but it’s not like free software translation teams have not
already worked for decades on tested processes.

Honestly, the easiest way to handle the translation of Emacs documents
would be to contact the TP manager and discuss the best way to do
that, while thinking about what kind of infrastructure we want on the
Emacs side to store and build all the translations.


-- 
Jean-Christophe Helary @jchelary@emacs.ch
https://traductaire-libre.org
https://mac4translators.blogspot.com
https://sr.ht/~brandelune/omegat-as-a-book/




