Re: [Qemu-devel] [PATCH v2] vl: set LC_CTYPE early in main() for all code
From: Peter Maydell
Subject: Re: [Qemu-devel] [PATCH v2] vl: set LC_CTYPE early in main() for all code
Date: Mon, 15 Apr 2019 15:25:00 +0100

On Mon, 15 Apr 2019 at 15:17, Daniel P. Berrangé <address@hidden> wrote:
>
> Localization is not a feature whose impact is limited to the UI
> frontends. Other parts of QEMU rely on localization. In particular the
> USB MTP driver needs to be able to convert filenames from the locale
> specified character set into UTF-16 / UCS-2 encoded wide characters.
> setlocale() is only set from two of the UI frontends though, and worse,
> there is inconsistent behaviour with GTK setting LC_CTYPE to C.UTF-8,
> while ncurses honours whatever is set in the user's environment.
>
> This causes the USB MTP driver to behave differently depending on which
> UI frontend is activated.
>
> Furthermore, the curses settings are dangerous because LC_CTYPE will affect
> the is{upper,lower,alnum} functions which much QEMU code assumes to have
> C locale sorting behaviour. This also breaks QMP if the env requests a
> non-UTF-8 locale, since QMP is defined to use UTF-8 encoding for JSON.
> This problematic curses code was introduced in:
>
> commit 2f8b7cd587558944532f587abb5203ce54badba9
> Author: Samuel Thibault <address@hidden>
> Date: Mon Mar 11 14:51:27 2019 +0100
>
> curses: add option to specify VGA font encoding
>
> This patch moves the GTK frontend setlocale() handling into the main()
> method. This ensures QMP and other QEMU code has a predictable C.UTF-8.
>
> Eventually QEMU should set LC_ALL, honouring the full user environment,
> but this needs various cleanups in QEMU code first. Hardcoding LC_CTYPE
> to C.UTF-8 is a partial regression vs the above curses commit, since it
> will break the curses wide character handling for non-UTF-8 locales but
> this is unavoidable until QEMU is cleaned up to cope with non-UTF-8
> locales fully.
>
> Setting of LC_MESSAGES is left in the GTK code since only the GTK
> frontend is using translation of strings. This lets us avoid the
> platform portability problem where LC_MESSAGES is not provided by
> locale.h on MinGW. GTK pulls it in indirectly from libintl.h via
> gi18n.h header, but we don't want to pull that into the global
> QEMU namespace.
>
> Signed-off-by: Daniel P. Berrangé <address@hidden>
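
For readers following the thread, here is a minimal sketch of the approach the commit message describes: switch only LC_CTYPE to C.UTF-8 and leave every other category alone. This is not the code from the patch. The helper name set_ctype_utf8() is hypothetical, and whether the "C.UTF-8" locale exists depends on the host C library, so the return value has to be checked.

```c
#include <locale.h>
#include <stddef.h>

/*
 * Hypothetical helper (not taken from the patch): switch only the
 * LC_CTYPE category to C.UTF-8. All other categories (LC_NUMERIC,
 * LC_COLLATE, ...) stay in the "C" locale the program starts in.
 *
 * Returns 1 if the C.UTF-8 locale was available, 0 otherwise.
 */
static int set_ctype_utf8(void)
{
    /*
     * setlocale() returns NULL and leaves the current locale
     * unchanged when the requested locale is not installed.
     */
    return setlocale(LC_CTYPE, "C.UTF-8") != NULL;
}
```

Calling this once at the top of main(), before any other code runs, would give every subsystem the same character-set view regardless of which UI frontend is selected, while numeric formatting keeps following the C locale.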
A few typo nits below...
>
> + /*
> + * Ideally we would set LC_ALL, but QEMU currently isn't able to cope
> + * with arbitrary localization settings. In particular there are three
> + * known problems
> + *
> + * - The QMP monitor needs to use the C locale rules for numeric
> + * formatting. This would need a double/int -> string formatter
> + * that is locale independant.
"independent"
> + *
> + * - The QMP monitor needs to encode all data as UTF-8. This needs
> + * to be updated to use iconv(3) to explicitly convert the current
> + * locale's charset into utf-8
> + *
> + * - Lots of codes uses is{upper,lower,alnum,...} functions, expecting
"code"
> + * C locale sorting behaviour. Most QEMU usage should likely be
> + * changed to g_ascii_is{upper,lower,alnum...} to match code
> + * assumptions, without being broken by locale settnigs.
"settings"
> + *
> + * We do still have two requirements
> + *
> + * - Ability to correctly display translated text according to the
> + * user's locale
> + *
> + * - Ability to handle multibyte characters, ideally according to
> + * user's locale specified character set. This affects ability
> + * of usb-mtp to correctly convert filenames to UCS16 and curses
> + * & GTK frontends wide character display.
> + *
> + * The second requirement would need LC_CTYPE to be honoured, but
> + * this conflicts with the 2nd & 3rd problems listed earlier. For
> + * now we make a tradeoff, trying to set an explicit UTF-8 localee
"locale"
> + *
> + * Note we can't set LC_MESSAGES here, since mingw doesn't define
> + * this constant in locale.h. Fortunately we only need it for the
> + * GTK frontend and that uses gi18n.h which pulls in a definition
> + * of LC_MESSAGES.
> + */
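
As an aside on the third problem in that comment, a rough sketch of what locale-independent classification looks like. GLib provides g_ascii_isupper() and friends for this; the ascii_* helpers below are hypothetical stand-ins written out by hand so the example has no GLib dependency.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-ins for GLib's g_ascii_is*() functions.
 * They look only at the ASCII range and never consult LC_CTYPE,
 * so the result is the same in every locale.
 */
static bool ascii_isupper(unsigned char c)
{
    return c >= 'A' && c <= 'Z';
}

static bool ascii_islower(unsigned char c)
{
    return c >= 'a' && c <= 'z';
}

static bool ascii_isdigit(unsigned char c)
{
    return c >= '0' && c <= '9';
}

static bool ascii_isalnum(unsigned char c)
{
    return ascii_isupper(c) || ascii_islower(c) || ascii_isdigit(c);
}
```

The difference from the <ctype.h> functions shows up for bytes above 0x7f: in a single-byte locale such as ISO-8859-1, isupper(0xC9) may report true, whereas ascii_isupper(0xC9) is always false.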
thanks
-- PMM