From: | Paul Eggert |
Subject: | bug#32472: sort doesn't sort and uniq loses data for many non-Latin scripts on UTF-8 locales |
Date: | Sat, 18 Aug 2018 10:34:31 -0700 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 |
Vaayda Yaasra wrote:
Here’s an example in Syriac:
ܡܠܬܐ
ܒܝܬܐ
ܒܪܢܫܐ
ܡܠܬܐ
Sort produces the following:
ܡܠܬܐ
ܒܝܬܐ
ܡܠܬܐ
ܒܪܢܫܐ
This is a property of your locale, so I suggest sending a bug report to whoever maintains your locale. You should be able to reproduce the problem by bypassing GNU 'sort' entirely and using the C strcoll function.
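A minimal sketch of such a check, calling strcoll directly without 'sort', might look like the following. The helper name collate_sign and the STRCOLL_DEMO_MAIN guard are illustrative assumptions, not part of any existing tool; only setlocale and strcoll are standard C.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: compare a and b with strcoll under the named
   locale. Returns -1, 0, or 1 for the sign of strcoll's result, or 2
   if the locale could not be set. Passing "" selects the locale from
   the environment (LC_ALL, LC_COLLATE, LANG). */
int collate_sign(const char *locale_name, const char *a, const char *b)
{
    assert(locale_name != NULL && a != NULL && b != NULL);

    if (setlocale(LC_COLLATE, locale_name) == NULL)
        return 2;

    int r = strcoll(a, b);
    return (r > 0) - (r < 0);
}

#ifdef STRCOLL_DEMO_MAIN
/* Demo driver: compare each pair of the three distinct words from the
   report, using whatever locale the environment selects. */
int main(void)
{
    const char *words[] = { "ܡܠܬܐ", "ܒܝܬܐ", "ܒܪܢܫܐ" };
    const int n = 3;

    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            printf("%s vs %s: %d\n", words[i], words[j],
                   collate_sign("", words[i], words[j]));
    return 0;
}
#endif
```

Compiling with -DSTRCOLL_DEMO_MAIN and running the program under the affected UTF-8 locale prints one line per pair. A result of 0 for two different words would suggest that the locale's collation data does not distinguish them, which would point at the locale rather than at 'sort'.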
For what it's worth, I observe the problem on Ubuntu 18.04 but not on Fedora 28. As Fedora tends to be more up-to-date, perhaps the problem is fixed already in glibc.