coreutils

maint: skip a check when en_US.UTF-8 collation rules are broken


From: Jim Meyering
Subject: maint: skip a check when en_US.UTF-8 collation rules are broken
Date: Mon, 25 Jul 2016 08:48:18 -0700

On OS X, *.UTF-8 locales use ASCII collating rules(!?):

  $ readlink /usr/share/locale/*.UTF-8/LC_COLLATE|sort -u
  ../la_LN.US-ASCII/LC_COLLATE

This means that sort, and any other program that relies on strcoll,
cannot be expected to work consistently on OS X in any UTF-8 locale.

I noticed this when sed's THANKS.in file sorted differently on OS X
than everywhere else. Here's a small C program to demonstrate the
problem. It prints -51 on OS X, yet 1 (indicating that "J.b" sorts
after "Ja") on Linux:

$ cat k.c
#include <string.h>
#include <stdio.h>
#include <locale.h>

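int
main (void)
{
  /* Use the collation rules of the locale named in the environment.  */
  setlocale (LC_ALL, "");
  /* Negative: "J.b" sorts before "Ja"; positive: it sorts after.  */
  int d = strcoll ("J.b", "Ja");
  printf ("%d\n", d);
  return 0;
}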
$ gcc -Wall -W k.c && ./a.out
-51

The "-51" comes from OS X's computation of '.' - 'a', i.e., 46 - 97 in
ASCII, at the first byte where the two strings differ.

Attachment: sort-vs-OSX.diff
Description: Text document
