bug-gnulib

Re: [PATCH] maint: set LANG=C instead of LC_ALL=C


From: Eric Blake
Subject: Re: [PATCH] maint: set LANG=C instead of LC_ALL=C
Date: Thu, 10 Aug 2017 20:22:42 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1

On 08/10/2017 05:37 PM, Bruno Haible wrote:
>> You still want a sane fallback for all the categories that you are not
>> explicitly setting.  It's harder to type:
>>
>> LANG=C LC_CTYPE=C.UTF-8 env -u LC_ALL foo
> 
> or:
>   LANG=C LC_CTYPE=C.UTF-8 LC_ALL= foo

If setting LC_ALL to empty forces fallback, then that is indeed easier
than explicitly unsetting it.  Still, if we are advocating mixed locale
execution, we MUST ensure sane defaults for ALL of the LC_* variables.
So even if you are advocating for keeping LC_ALL set, we should STILL
sanitize LANG and all the other LC_* variables (either unset them, or
set them to C).  The above command line will not work if LC_MESSAGES is
still set to some other locale, particularly one not encoded in UTF-8.
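
To make the precedence concrete, here is a rough sketch of how POSIX picks
the effective locale for one category: a non-empty LC_ALL wins, then the
category's own variable, then LANG.  The helper name resolve_locale is
made up for illustration, and the final fallback to "C" is an assumption
(POSIX leaves that default implementation-defined).

```shell
# resolve_locale LC_ALL_VALUE CATEGORY_VALUE LANG_VALUE
# Hypothetical helper: prints the locale name that POSIX precedence would
# select for a single category, given the three tiers as arguments.
# An empty string is treated the same as an unset variable.
resolve_locale() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"      # tier 1: LC_ALL overrides everything
    elif [ -n "$2" ]; then
        printf '%s\n' "$2"      # tier 2: the category's own LC_* variable
    elif [ -n "$3" ]; then
        printf '%s\n' "$3"      # tier 3: LANG
    else
        printf '%s\n' "C"       # assumed default; implementation-defined
    fi
}

# Example: what the current environment would give for LC_MESSAGES.
# resolve_locale "${LC_ALL-}" "${LC_MESSAGES-}" "${LANG-}"
```

With LC_ALL empty, LANG=C and LC_CTYPE=C.UTF-8, the LC_CTYPE tier yields
C.UTF-8 as intended, but an inherited LC_MESSAGES still wins over LANG
for its own category, which is the failure described above.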

> In the big picture, I find the current situation more maintainable and robust:
> Everyone knows that when specifying a "mixed" locale they need to set all
> of LC_ALL, LC_<category>, and LANG.

Not just the LC_<category> they are overriding, but every other
LC_<category> as well.

> 
> Whereas in the world you are depicting, you would be doing half of a
> mixed locale specification and depending on maint.mk to do the other half.
> Call it "dependency" or call it "distributed responsibility" - in any
> case it would require coordination and reduce the freedom of action
> for the maintainers of maint.mk.

But the command lines with the distributed responsibility are longer, so
sane defaults ARE worth having.  Maybe you can still argue that
the sane default should include LC_ALL=C, but that does NOT mean that
the sane default can overlook the other tiers.

> 
> In particular, you don't even have a guarantee that LANG=C and
> LC_CTYPE=C.UTF-8 fit well together. (It might work on some libcs and
> fail on others.)

We already have problems with Python refusing to import UTF-8 data in
LC_ALL=C environments (which is arguably a bug in Python, since POSIX
says locale "C" is 8-bit clean and therefore cannot cause encoding
errors).  C.UTF-8 does not exist everywhere, but does appear to silence
the Python problem when mixed with LANG=C, at least on the platforms
where the problems are encountered in the first place.
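
Since C.UTF-8 is not available everywhere, a rule could probe for it
before relying on it.  This is only a sketch: has_c_utf8 is a
hypothetical helper that reads a `locale -a` style listing on stdin.
It accepts both spellings because glibc's `locale -a` prints the name
as C.utf8 rather than C.UTF-8.

```shell
# has_c_utf8: succeed if the locale listing on stdin contains C.UTF-8.
# Hypothetical helper; matches "C.UTF-8", "C.utf8" and case variants.
has_c_utf8() {
    grep -qiE '^C\.utf-?8$'
}

# Possible use (not executed here):
#   if locale -a | has_c_utf8; then ctype=C.UTF-8; else ctype=C; fi
```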

> With your
> proposal, we don't have a clear responsibility: it's not clear whether
> the setting of LANG=C by maint.mk or the setting of LC_CTYPE=C.UTF-8 is to
> be changed.

I think maint.mk should still default to the C locale, but make it easy
to do a mixed-locale override.  With LANG set and all other tiers
cleared, mixed-locale overrides are easy; but with LC_ALL set, an
override has to worry about all three tiers.
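
Roughly, the default I have in mind behaves like the sketch below.  The
name run_sanitized is made up, and this is not what maint.mk currently
does: it clears LC_ALL and the POSIX LC_* categories, sets LANG=C, and
lets the caller add a single-category override as a NAME=VALUE argument
passed through env.  GNU systems define further categories (LC_PAPER and
friends) and LANGUAGE, which this sketch leaves alone.

```shell
# run_sanitized [NAME=VALUE ...] COMMAND [ARG ...]
# Hypothetical wrapper: run COMMAND with LANG=C as the only locale
# setting, plus any explicit NAME=VALUE overrides given by the caller.
run_sanitized() {
    (
        # Clear tier 1 (LC_ALL) and the POSIX tier-2 categories so that
        # nothing inherited from the user's environment leaks through.
        unset LC_ALL LC_COLLATE LC_CTYPE LC_MESSAGES \
              LC_MONETARY LC_NUMERIC LC_TIME
        # Tier 3: the fallback for every category not overridden below.
        LANG=C
        export LANG
        # env applies the caller's NAME=VALUE overrides, then runs COMMAND.
        exec env "$@"
    )
}

# Possible use (not executed here):
#   run_sanitized LC_CTYPE=C.UTF-8 foo
```

The override then only has to name the one category it cares about,
instead of worrying about all three tiers.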

> 
> Which is why I prefer the current situation, without mixed or intertwined
> responsibilities.
> 
> Bruno
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
