Re: GNU sed version 4.2.1: on OS X, C locale gets aliased to UTF-8
From: Paul Eggert
Subject: Re: GNU sed version 4.2.1: on OS X, C locale gets aliased to UTF-8
Date: Sun, 16 Sep 2012 10:49:35 -0700
User-agent: Mozilla/5.0 (X11; Linux i686; rv:15.0) Gecko/20120827 Thunderbird/15.0
On 09/16/2012 07:46 AM, Max Horn wrote:
> so, any news on this?
No, and it's been some time, so I pushed the July patch, as follows.
Thanks for following up on this.
localcharset: work around Mac OS X bug with UTF-8 and MB_CUR_MAX
* lib/localcharset.c (locale_charset) [DARWIN7]:
Return "ASCII" if the system reports "UTF-8" and MB_CUR_MAX <= 1,
as these two values are incompatible. Problem reported by Max Horn.
For more discussion, please see
<http://lists.gnu.org/archive/html/bug-gnulib/2012-09/msg00061.html>.
diff --git a/lib/localcharset.c b/lib/localcharset.c
index 54a2432..1a94042 100644
--- a/lib/localcharset.c
+++ b/lib/localcharset.c
@@ -542,5 +542,12 @@ locale_charset (void)
   if (codeset[0] == '\0')
     codeset = "ASCII";
+#ifdef DARWIN7
+  /* Mac OS X sets MB_CUR_MAX to 1 when LC_ALL=C, and "UTF-8"
+     (the default codeset) does not work when MB_CUR_MAX is 1. */
+  if (strcmp (codeset, "UTF-8") == 0 && MB_CUR_MAX <= 1)
+    codeset = "ASCII";
+#endif
+
   return codeset;
 }
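
For readers who want to try the check outside of gnulib, here is a minimal standalone sketch of the same logic. The names adjust_codeset and effective_codeset are hypothetical helpers invented for this illustration and are not part of lib/localcharset.c. Unlike the patch, the sketch is not guarded by DARWIN7, so it applies the fallback on every platform; it assumes a POSIX system that provides nl_langinfo.

```c
#include <assert.h>
#include <langinfo.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not gnulib API): given the codeset name the
   system reports and the current MB_CUR_MAX, return the name a caller
   should actually use.  Mirrors the two fallbacks in the patch above.  */
static const char *
adjust_codeset (const char *codeset, size_t mb_cur_max)
{
  assert (codeset != NULL);

  /* An empty name means the system reported nothing useful.  */
  if (codeset[0] == '\0')
    return "ASCII";

  /* "UTF-8" needs multibyte support; if MB_CUR_MAX is 1 the two
     values contradict each other, so fall back to "ASCII".  */
  if (strcmp (codeset, "UTF-8") == 0 && mb_cur_max <= 1)
    return "ASCII";

  return codeset;
}

/* Hypothetical wrapper: apply the adjustment to whatever the current
   locale reports.  Call setlocale first if you want the environment's
   locale rather than the default "C" locale.  */
static const char *
effective_codeset (void)
{
  return adjust_codeset (nl_langinfo (CODESET), MB_CUR_MAX);
}
```

Passing the codeset name and MB_CUR_MAX as parameters keeps the decision testable without depending on the locale of the machine running it; effective_codeset is only a thin wrapper over the real system values.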