Re: Re-2: uniq works not correct

From: Eric Blake
Subject: Re: Re-2: uniq works not correct
Date: Mon, 08 Sep 2008 17:48:04 -0600

[Please keep the list posted on replies, so that others may chime in]

According to address@hidden on 9/8/2008 1:27 PM:
> Hello,
> Thanks for your answer.
> A file named "test" was attached to my first mail.
> To reach my conclusion, I passed the same file to sort twice;
> there's no way around this.
> Type the following command:
> sort test test | uniq -u | wc -l
> The result should be "0".

And I got that result, because I did 'export LC_ALL=C' (bash notation; or
'setenv LC_ALL C' for csh notation) before running the experiment.  I
could not reproduce your failure, which is almost certainly due to your
current locale settings.
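
For anyone following along, here is a minimal sketch of that experiment.
The original "test" attachment is not reproduced here, so the sample
lines and the temporary file are stand-ins, and the 'tr' call is only
there to strip the padding that some 'wc' implementations print.

```shell
# Hypothetical sample data; the reporter's "test" file is not available.
tmpfile=$(mktemp)
printf 'banana\napple\ncherry\napple\n' > "$tmpfile"

# Force the C locale so that both sort and uniq compare lines byte by byte.
count=$(LC_ALL=C sort "$tmpfile" "$tmpfile" | LC_ALL=C uniq -u | wc -l | tr -d '[:space:]')

# Feeding the same file to sort twice gives every line an even number of
# copies, so uniq -u (print only unrepeated lines) should print nothing.
echo "$count"

rm -f "$tmpfile"
```

Setting LC_ALL on each command rather than exporting it keeps the
override local to this one pipeline.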

But in looking closer at your report, I noticed that uniq uses xmemcoll (a
wrapper around strcoll) rather than strcmp when determining whether lines
are equal.  So, it looks like uniq is SUPPOSED to recognize lines with
different byte contents but equal collation values as identical, but that
it failed to do so in your case.  It would be very informative for us to
know which locale you were running when you saw unexpected results;
perhaps there is a bug after all, where sort's use of xmemcoll and uniq's
use of xmemcoll are not lining up, to the point where uniq is not properly
filtering lines that sort treated as identical.  Please show us the output
of running 'locale' on your SUSE 11.0 box.
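
If it helps, something along these lines would gather the details in one
go. It is only a sketch: the sample file and the 'leftover' helper are
made up for illustration, and the reporter's real "test" file should be
substituted to see the actual failure.

```shell
# Hypothetical stand-in for the reporter's "test" file.
sample=$(mktemp)
printf 'b\na\nB\nA\n' > "$sample"

# Count the lines that uniq -u leaves over after sorting the file twice.
# The locale in effect is whatever the caller's environment specifies.
leftover() {
    sort "$sample" "$sample" | uniq -u | wc -l | tr -d '[:space:]'
}

# Show the current locale settings, if the locale utility is installed.
command -v locale >/dev/null && locale

# Run once under the current locale and once with the C locale forced.
current_count=$(leftover)
c_count=$(LC_ALL=C; export LC_ALL; leftover)

echo "current locale: $current_count, C locale: $c_count"

rm -f "$sample"
```

A nonzero count under the current locale alongside a zero under C would
point at the collation mismatch between sort and uniq suspected above.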

--
Don't work too hard, make some time for fun as well!

Eric Blake             address@hidden