bug-coreutils


From: Paul Eggert
Subject: bug#32472: sort doesn't sort and uniq loses data for many non-Latin scripts on UTF-8 locales
Date: Sat, 18 Aug 2018 10:34:31 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

Vaayda Yaasra wrote:
Here’s an example in Syriac:

ܡܠܬܐ
ܒܝܬܐ
ܒܪܢܫܐ
ܡܠܬܐ

Sort produces the following:

ܡܠܬܐ
ܒܝܬܐ
ܡܠܬܐ
ܒܪܢܫܐ

This is a property of your locale, so I suggest sending a bug report to whoever maintains your locale. You should be able to reproduce the problem by bypassing GNU 'sort' entirely and using the C strcoll function.
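A minimal sketch of such a check follows. It is only an illustration: the helper name collation_sign and the STRCOLL_DEMO_MAIN macro are made up for this example, and the sample words are the ones from the report. It calls setlocale and strcoll directly, so GNU 'sort' is not involved at all.

```c
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: report how strcoll orders A and B under the
   named locale.  Returns -1, 0, or 1 (the sign of strcoll's result),
   or 2 if setlocale cannot select that locale.  */
int collation_sign(const char *locale_name, const char *a, const char *b)
{
    if (setlocale(LC_COLLATE, locale_name) == NULL)
        return 2;
    int r = strcoll(a, b);
    return (r > 0) - (r < 0);
}

#ifdef STRCOLL_DEMO_MAIN
int main(void)
{
    /* The three distinct words from the report, encoded as UTF-8.  */
    const char *words[] = { "ܡܠܬܐ", "ܒܝܬܐ", "ܒܪܢܫܐ" };
    size_t n = sizeof words / sizeof words[0];

    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            /* "" selects the locale named by the environment
               (LC_ALL, LC_COLLATE, LANG); "C" compares bytes.  */
            printf("%s vs %s: environment locale %d, C locale %d\n",
                   words[i], words[j],
                   collation_sign("", words[i], words[j]),
                   collation_sign("C", words[i], words[j]));
    return 0;
}
#endif
```

Compile with -DSTRCOLL_DEMO_MAIN to get a standalone program and run it under the UTF-8 locale in question. A result of 0 for two different words means strcoll itself treats them as equal in that locale, which would point at the locale data rather than at 'sort'.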

For what it's worth, I observe the problem on Ubuntu 18.04 but not on Fedora 28. As Fedora tends to be more up-to-date, perhaps the problem is fixed already in glibc.




