Re: [bug-gettext] Further about --kde mode for xgettext


From: Chusslove Illich
Subject: Re: [bug-gettext] Further about --kde mode for xgettext
Date: Tue, 13 Jan 2015 11:33:05 +0100
User-agent: KMail/1.13.7 (Linux/3.0.13-0.27-default; KDE/4.10.3; x86_64; ; )

> [: Daiki Ueno :]
> $ git clone git://anongit.kde.org/ki18n
>
> Is this the right place to start with?

Yes.

>> [: Chusslove Illich :]
>> because if the translation were to contain %1 (e.g. left in as unfuzzing
>> error), the runtime would try to replace it with an argument. So, to be
>> technically correct, xgettext with --kde should not try to test strings
>> for KDE 4 format, but blindly add kde-format to all of them.
>
> [: Daiki Ueno :]
> I wonder if this is really useful. If a placeholder were removed from the
> string, wouldn't the argument also be omitted from the i18n(...) call and
> cause unwanted output e.g. "File (null) not found."?

Yes, that is what I meant. The programmer removes the placeholder and the
argument, then the translator unfuzzies the string wrongly. msgfmt -c no
longer checks it, since it lacks the kde-format flag, and an equivalent
unwanted output appears with the translation.

This may sound like a more general argument that could apply to all formats.
But the distinction is technical, in that Ki18n's i18n* calls always
interpret placeholders, even if there are no arguments. This is as opposed
to the case where a placeholder-interpreting call is used only if there are
some arguments (e.g. printf vs. puts, or Qt's selective use of the
QString::arg() method).
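
To make the failure mode concrete, here is a minimal sketch of the kind of
placeholder comparison that a kde-format check implies. It is a simplified
stand-in written for illustration, not the msgfmt implementation; the
function names and the example strings are hypothetical.

```python
import re

# Simplified stand-in for a kde-format check; NOT the actual msgfmt code.
# Assumption: placeholders are written as % followed by one or two digits.
PLACEHOLDER = re.compile(r"%(\d{1,2})")


def placeholders(text):
    """Return the set of %N placeholder numbers found in text."""
    return {int(n) for n in PLACEHOLDER.findall(text)}


def check_kde_format(msgid, msgstr):
    """Return a list of problems; an empty list means the pair is consistent."""
    src, dst = placeholders(msgid), placeholders(msgstr)
    problems = []
    for n in sorted(dst - src):
        problems.append(f"%{n} appears in msgstr but not in msgid")
    for n in sorted(src - dst):
        problems.append(f"%{n} appears in msgid but not in msgstr")
    return problems


# The programmer dropped %1 from the source string, the translation kept it.
print(check_kde_format("File not found.", "Datei %1 nicht gefunden."))
```

A check of this sort only helps if it runs on messages without placeholders
in the msgid too, which is why the flag would have to be added blindly.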

>> A new specialty in KF5 is that there is another set of translation calls
>> which is guaranteed to contain well-formed XML strings. [...] would be
>> good if they get another flag, e.g. kde-xml-format. [...]
>
> I remember there were related discussions before, talking about a more
> generic xml-format tag. I agree that it would be a useful feature, though
> the msgfmt -c implementation wouldn't be straightforward; maybe one would
> need to consider encodings, entity references, etc.

Yes, "more generic" is a big step, with all sorts of troubles. That is why
for kde-xml-format I propose that msgfmt -c at first does exactly the same
as for kde-format and nothing more. To clarify, kde-xml-format is nearly a
superset of kde-format, so strings such as this are expected:

  #, kde-xml-format
  msgid "File <filename>%1</filename> not found."

Therefore msgfmt -c should still check placeholder matching. Markup-specific
checks could be performed by other tools (which already exist).

As a second step ("when someone gets to it"), msgfmt -c could additionally
check well-formedness only. That would catch 90%-95% of markup-related
errors in practice.

Here it is even more important that every message from an xi18n* call gets
the kde-xml-format flag, whether or not it has any placeholders or markup.
For example, if the translation should contain a less-than character which
is not present in the original, it must be escaped as &lt;. If it is not,
that will be caught by whatever validation tool takes the kde-xml-format
flag into account.
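
As an illustration of a well-formedness-only check, here is a minimal sketch
using Python's standard expat bindings. The helper name and the dummy root
element are my own assumptions, and a strict XML parser like this makes no
allowance for any KDE-specific leniency.

```python
import xml.parsers.expat


def is_well_formed(fragment):
    """Check a message by parsing it inside a dummy root element."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The wrapper element is hypothetical; it only gives the parser
        # a single document root to work with.
        parser.Parse(f"<root>{fragment}</root>", True)
    except xml.parsers.expat.ExpatError:
        return False
    return True


print(is_well_formed("File <filename>%1</filename> not found."))
print(is_well_formed("Size &lt; %1"))
print(is_well_formed("Size < %1"))
print(is_well_formed("File <filename>%1 not found."))
```

The first two messages parse, while the unescaped less-than character and the
unclosed tag are rejected, even though the third message contains no markup
at all.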

> That would make sense, if the current kde-format is useless and
> kde-xml-format is really specific to KDE.

kde-format and kde-xml-format are related as mentioned above, so both should
be there.

I would say kde-xml-format is really specific to KDE, in several ways. This
depends on how one would handle "generic XML", but here is a very basic
specificity: a literal ampersand needs to be escaped as &amp; only if it is
in a position where it could be interpreted as the start of an entity
reference. This is due to Qt's use of the ampersand as the shortcut
accelerator marker, to avoid escaping it all the time.
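
A rough sketch of that ampersand rule follows. The regular expression encodes
my assumption of what "could be interpreted as the start of an entity
reference" means (an ampersand followed by a name or a numeric character
reference and a terminating semicolon); it is not taken from Ki18n's code,
and the function name is hypothetical.

```python
import re

# Assumption: an ampersand is "in entity position" when it is followed by
# a name, or by #digits / #xhex, and then a semicolon.
ENTITY_START = re.compile(
    r"&(?=(?:[A-Za-z_][A-Za-z0-9_.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
)


def entity_like_ampersands(text):
    """Return offsets of ampersands that would be read as entity references."""
    return [m.start() for m in ENTITY_START.finditer(text)]


print(entity_like_ampersands("&Save file"))   # accelerator marker, no escaping
print(entity_like_ampersands("Cut & paste"))  # bare ampersand, no escaping
print(entity_like_ampersands("AT&T; more"))   # literal here needs &amp;
```

Only in the last message would a literal ampersand have to be written as
&amp;, because "&T;" has the shape of an entity reference.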

>> [With] the --kde option [...] that the user does not have to specify
>> many -k and --flag options just to support all the default calls (32 in
>> total [...]
>
> Couldn't those options be added to the KDE's CMake rules? That would allow
> users to tweak options when a new keyword is added to KDE.

A solution analogous to this is used currently (not in CMake rules, but
through some repository-side automation). But I feel that there should be no
dependency on the exact build system, and that it would be most elegant if
xgettext handled the defaults, since it has the --kde option. In particular,
at present it makes no sense to use --kde without any -k options, as this
would extract Gettext default calls, which are not usable[*]. Of course, I
can instead simply provide a template command line for xgettext invocation
in Ki18n's distribution.

[*] This whole thread is a consequence of the fact that Ki18n calls couple
translation fetching with post-processing (argument formatting, etc.). This
is unlike the normal uncoupled approach, where the raw translated string is
delivered by a translation call, and post-processing is performed afterwards
by another facility.
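
For a sense of what a template command line might look like, here is a small
sketch that assembles one from a keyword table. It covers only four of the
calls as an illustrative subset, not the full default set; the output file
name, source file name and helper name are placeholders of my choosing.

```python
import shlex

# Illustrative subset only; the full set of Ki18n calls is larger.
# Each value is an xgettext keyword spec: argument positions, with "c"
# marking the context argument.
KEYWORDS = {
    "i18n": "1",
    "i18nc": "1c,2",
    "i18np": "1,2",
    "i18ncp": "1c,2,3",
}


def xgettext_command(sources, output="messages.pot"):
    """Build an xgettext argument list for the given source files."""
    argv = ["xgettext", "--kde", "--from-code=UTF-8", "-C", "-o", output]
    argv += [f"-k{name}:{spec}" for name, spec in KEYWORDS.items()]
    return argv + list(sources)


# Print the command line in a form that can be pasted into a shell.
print(shlex.join(xgettext_command(["main.cpp"])))
```

Keeping the keywords in a table like this is one way to ship the defaults
with Ki18n without tying them to a particular build system.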

-- 
Chusslove Illich (Часлав Илић)
