bug-gettext

Re: [bug-gettext] picking strings to translate from a program's output


From: Egmont Koblinger
Subject: Re: [bug-gettext] picking strings to translate from a program's output
Date: Thu, 2 May 2019 12:27:28 +0200

Hi Bruno,

> The "hyperlinks in terminal emulators" [1] idea that you have started [...]
> innovation of the decade!

Thanks for your kind words; but to be fair, it wasn't me initially
throwing in the idea :)


> With the idea of conveying a user action through a URL, embedded in the string
> through an escape sequence, it will be possible to implement this approach
>   - for programs with text output as well,
>   - without pervasive changes to the program's code.

I know it's generally an unsolved issue how to translate while using
the application, without pervasive changes to the code.

Your use case for the hyperlinks for translating terminal-based apps
is a great idea, one that I haven't thought of, although I'm not
convinced it's generally applicable. There are several important
gotchas.

> 1. The gettext() function is overridden/modified to produce a string with
>    an escape sequence that contains an URL that specifies the PO file and
>    msgid.

A. By doing this, the length of the string changes (significantly).
This means that if a utility calls wcswidth() (or even worse:
strlen()) to measure the width occupied in the terminal and performs
alignment/indentation (e.g. table layout; padding to the terminal's
right edge...) based on that, it won't work properly.

Maybe wcswidth() can also be overridden to skip such escape sequences,
but if the utility manually sums up the per-character wcwidth()s or
uses some other similar method, it'll still fail.
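
To make the size difference concrete, here is a minimal sketch. It
assumes the OSC 8 framing "ESC ] 8 ; ; URI ESC \" and ASCII-only text;
wrap_osc8(), visible_len() and the po:// URL are made-up names for
illustration, not anything gettext provides.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Wrap `text` in an OSC 8 hyperlink pointing at `url`.
   Returns what snprintf() returns (the untruncated length). */
static int wrap_osc8(char *out, size_t outsz, const char *url, const char *text)
{
    assert(out && url && text);
    return snprintf(out, outsz, "\033]8;;%s\033\\%s\033]8;;\033\\", url, text);
}

/* Count the bytes that are not part of an OSC sequence
   (ESC ] ... terminated by BEL or ESC \). ASCII-only: one byte
   is treated as one cell, which is wrong for wide/combining chars. */
static size_t visible_len(const char *s)
{
    size_t n = 0;
    assert(s);
    while (*s) {
        if (s[0] == '\033' && s[1] == ']') {
            s += 2;
            while (*s && *s != '\007' && !(s[0] == '\033' && s[1] == '\\'))
                s++;
            if (*s == '\007')
                s++;            /* BEL terminator */
            else if (*s == '\033')
                s += 2;         /* ESC \ terminator */
        } else {
            n++;
            s++;
        }
    }
    return n;
}
```

With a 14-character message, strlen() on the wrapped string reports
far more than 14, while visible_len() still reports 14. A utility that
computes the width itself would need a filter of this kind in every
place where it measures.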

B. Care has to be taken that input sanitization (e.g. removal of
escape sequences) happens before applying the hyperlink. E.g.
pseudo-code like

    printf(_("Cannot remove file %s\n"), filename);

might mess up the terminal if filename contains escape sequences.
Sanitizing it like

    printf(sanitize(_("Cannot remove file %s\n"), filename));

would wipe out the hyperlinks as well. Or sanitizing like

    printf(_("Cannot remove file %s\n"), sanitize(filename));

is okay for hyperlinkifying for the terminal; however, you then cannot
structure your code so that constructing the message and deciding what
to do with it (print to terminal vs. send to logs vs. display in a
graphical toolkit etc.) happen in two different places, which is
possibly what the existing code already looks like.
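
A small sketch of the ordering problem, under the assumption that
sanitizing means replacing control bytes with '?'. Both sanitize() as
written here and build_message(), which stands in for a hyperlinking
gettext() with a made-up po:// URL, are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Replace C0 control bytes (including ESC) and DEL with '?', in place. */
static void sanitize(char *s)
{
    assert(s != NULL);
    for (; *s; s++) {
        unsigned char c = (unsigned char)*s;
        if (c < 0x20 || c == 0x7f)
            *s = '?';
    }
}

/* Stand-in for printf(_("Cannot remove file %s\n"), filename) where the
   translated template already carries the OSC 8 hyperlink. */
static int build_message(char *out, size_t outsz, const char *filename)
{
    assert(out && filename);
    return snprintf(out, outsz,
                    "\033]8;;po://demo/cannot-remove\033\\"
                    "Cannot remove file %s"
                    "\033]8;;\033\\\n",
                    filename);
}
```

Calling sanitize() on the filename first and build_message() second
keeps the link and neutralizes the filename. Calling sanitize() on the
finished message also replaces the ESC bytes of the link itself (and
the trailing newline), so the hyperlink is gone.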

C. With the strings becoming noticeably larger, there's a risk that
you trigger a buffer overflow in some temporary fixed-size buffer.
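
One possible mitigation on the producing side, sketched under the
assumption that the wrapper knows the size of the destination buffer
(which an overridden gettext() usually would not): measure first, and
fall back to the plain text if the linked form doesn't fit.
linked_size() and format_linked() are made-up helper names.

```c
#include <assert.h>
#include <stdio.h>

/* Bytes needed (excluding the NUL) for `text` wrapped in an OSC 8 link. */
static int linked_size(const char *url, const char *text)
{
    assert(url && text);
    return snprintf(NULL, 0, "\033]8;;%s\033\\%s\033]8;;\033\\", url, text);
}

/* Write the linked form if it fits into `out`, else the plain text
   (which snprintf() truncates if even that is too long).
   Returns 1 if the link was emitted, 0 for the fallback. */
static int format_linked(char *out, size_t outsz,
                         const char *url, const char *text)
{
    int need = linked_size(url, text);
    assert(out && outsz > 0);
    if (need >= 0 && (size_t)need < outsz) {
        snprintf(out, outsz, "\033]8;;%s\033\\%s\033]8;;\033\\", url, text);
        return 1;
    }
    snprintf(out, outsz, "%s", text);
    return 0;
}
```

This doesn't help existing code that copies the result of gettext()
into its own fixed-size buffer with strcpy() or sprintf(); that code
would have to be audited separately.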

D. If a gettext'ed string is embedded in another gettext'ed string
(which is probably a bad practice, but sure happens sometimes), the
hyperlink isn't restored for the trailing segment of the string. This
is because of a difference between HTML's DOM tree model vs. the
terminal emulator's state machine: a terminating OSC 8 doesn't restore
the previous value but switches to non-hyperlink. Example:

    printf(_("Give me a %s please"), foo ? _("apple") : _("banana"));

"Give me a " will be a hyperlink for translating the template string,
the fruit will be a link for translating the fruit, but " please" will
be a non-link. This is especially a problem if the template string
begins with a placeholder. (We might think about extending the
protocol to have push/pop, or someone can just alter a terminal
emulator's behavior against the spec for the purpose of translating
only: non-empty URLs would automatically be pushed, and an empty one
would pop instead of setting non-hyperlink.)
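
The difference between the two behaviors can be simulated with a toy
state machine. This is only a sketch: link_flags() is a made-up
function, it recognizes only OSC 8 with empty parameters terminated by
ESC \, and the stack variant tracks just the nesting depth rather than
the URIs themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 8

/* Walk `s` and store one flag per visible byte in `linked`:
   1 if that byte is inside a hyperlink, 0 otherwise.
   use_stack == 0: OSC 8 as specified (an empty URI switches links off).
   use_stack != 0: hypothetical variant (non-empty URI pushes, empty pops).
   Returns the number of visible bytes stored (at most `max`). */
static size_t link_flags(const char *s, int use_stack, char *linked, size_t max)
{
    size_t n = 0;
    int depth = 0;
    assert(s && linked);
    while (*s && n < max) {
        if (strncmp(s, "\033]8;;", 5) == 0) {
            const char *uri = s + 5;
            const char *end = strstr(uri, "\033\\");
            if (!end)
                break;                      /* unterminated sequence */
            int empty = (end == uri);
            if (use_stack) {
                if (!empty && depth < MAX_DEPTH)
                    depth++;                /* push */
                else if (empty && depth > 0)
                    depth--;                /* pop */
            } else {
                depth = empty ? 0 : 1;      /* set or clear, no memory */
            }
            s = end + 2;
        } else {
            linked[n++] = (char)(depth > 0);
            s++;
        }
    }
    return n;
}
```

Fed with the "Give me a %s please" example where both the template and
the fruit are wrapped, the spec-conforming mode marks " please" as
unlinked, while the push/pop mode keeps it linked until the template's
own terminator.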

E. The entire approach is unsuitable if you have any existing screen
handling library in place, such as ncurses, slang, newt... or some
manual screen handling code implemented by the app. Use of this idea
is pretty much limited to apps that produce output on their
stdout/stderr, probably with some basic terminal handling like
coloring, but without advanced terminal handling such as cursor
positioning, overwriting existing text etc.


> This way, a good portion of the strings of a program can be translated with
> context, and the "linear" approach without context is limited to messages
> which are hard to produce.

Do you have concrete terminal-based apps in your mind that you'd
prefer to be translatable this way? I'm wondering if it's really worth
it to build up the said infrastructure (with webservers etc.) to
provide an alternate workflow compared to the "linear" approach plus
testing. Is there a sufficiently large set of tools to be translated +
translators willing to give this new workflow a try? I really don't
know.

What are your thoughts on these?


cheers,
egmont


