From: Bruno Haible
Subject: Re: [PATCH 3/4] quotearg: use Unicode quotes in UTF-8 locales when no translation is available
Date: Tue, 20 Dec 2011 00:57:56 +0100
User-agent: KMail/1.13.6 (Linux/; KDE/4.6.0; x86_64; ; )

Hi Paolo,

Thanks for these improvements.

> @@ -188,8 +189,21 @@ static char const *
>  gettext_quote (char const *msgid, enum quoting_style s)
>  {
>    char const *translation = _(msgid);
> -  if (translation == msgid && s == clocale_quoting_style)
> -    translation = "\"";
> +  char const *locale_code;
> +
> +  if (translation != msgid)
> +    return translation;
> +
> +  assert (s == clocale_quoting_style || s == locale_quoting_style);

In gnulib, we avoid 'assert' because we hate to ship binaries that are
less reliable than those that we tested on our development machines.

In this case, the assertion can be put in a comment: Since the function
is 'static', the risk that a future refactoring would pass a different
value for s is small.

> +  /* For UTF-8, use single quotes.  */

It would be good to mention U+2018 and U+2019 here, in the comments.

> +  locale_code = locale_charset ();
> +  if (STRCASEEQ (locale_code, "UTF-8", 'U','T','F','-','8',0,0,0,0))
> +    return msgid[0] == '`' ? "\xe2\x80\x98": "\xe2\x80\x99";

UTF-8 is not the only encoding that contains the character U+2018. Others are:

ISO-8859-7   0xA1
KOI8-T       0x91
CP869        0x8B
CP874        0x91
CP932        0x81 0x65
CP936        0xA1 0xAE
CP949        0xA1 0xAE
CP950        0xA1 0xA5
CP1250       0x91
CP1251       0x91
CP1252       0x91
CP1253       0x91
CP1254       0x91
CP1255       0x91
CP1256       0x91
CP1257       0x91
EUC-JP       0xA1 0xC6
EUC-KR       0xA1 0xAE
EUC-TW       0xA1 0xE4
BIG5         0xA1 0xA5
BIG5-HKSCS   0xA1 0xA5
EUC-CN       0xA1 0xAE
GBK          0xA1 0xAE
GB18030      0xA1 0xAE
Georgian-PS  0x91
PT154        0x91

Among these, only GB18030 is still important as a locale encoding nowadays.
So, please add two more lines:

  if (STRCASEEQ (locale_code, "GB18030", 'G','B','1','8','0','3','0',0,0))
    return msgid[0] == '`' ? "\xa1\xae": "\xa1\xaf";
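
Putting the three remarks together (assertion as a comment, code points
named in the comments, GB18030 handled next to UTF-8), the fallback path
could look roughly like the sketch below. It is only an illustration:
fallback_quote is a hypothetical helper, the charset is passed in as a
parameter instead of coming from locale_charset (), POSIX strcasecmp
stands in for STRCASEEQ, and returning msgid unchanged for other
charsets is an assumption made here for brevity.

```c
#include <strings.h>   /* strcasecmp (POSIX) */

/* Hypothetical stand-in for the fallback path of gettext_quote:
   return the bytes of the quotation mark to use for MSGID ("`" or "'")
   in a locale whose encoding is CHARSET, when no translation exists.  */
static char const *
fallback_quote (char const *msgid, char const *charset)
{
  /* Callers pass only the opening "`" or the closing "'" as MSGID.
     This precondition is stated here instead of being checked with
     assert.  */

  /* UTF-8: U+2018 LEFT SINGLE QUOTATION MARK and
     U+2019 RIGHT SINGLE QUOTATION MARK.  */
  if (strcasecmp (charset, "UTF-8") == 0)
    return msgid[0] == '`' ? "\xe2\x80\x98" : "\xe2\x80\x99";

  /* GB18030: the same two characters, encoded as 0xA1 0xAE
     and 0xA1 0xAF.  */
  if (strcasecmp (charset, "GB18030") == 0)
    return msgid[0] == '`' ? "\xa1\xae" : "\xa1\xaf";

  /* Other charsets: keep the ASCII quote (assumption of this sketch).  */
  return msgid;
}
```

With this shape, supporting another encoding from the list above would
mean adding one more comparison and one more pair of byte strings.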

