bug-coreutils

bug#18121: A bug in sort.


From: Pádraig Brady
Subject: bug#18121: A bug in sort.
Date: Mon, 28 Jul 2014 09:41:08 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

On 07/28/2014 01:05 AM, Tom Bryant wrote:
> I issued a "sort -n hugeFile > sortedHugeFile" and it introduced a very 
> occasional but destructive "x" in to the data.
> 
> The original data consisted of numeric fields, separated by the vertical bar, 
> "|", and +, - and spaces.  It was 25861964610 bytes in size.
> 
> The final file had around 10 "x" characters overwritten in it.  It too was  
> 25861964610 bytes in size.  I copy the first few lines to give you an idea of 
> what sort was sorting:
> 
> 0.01996377896414875189|-1.56937596815334989842|13950|13860|9|0|0|146|158|8|6|2|9697|59367|119|65406|159|161|1101364107|12467|12131|11963|5|5|5|2|3|3|20000|20000|20000|20000|20000|99|99|99|99|99|3|1|0|0|1000076|1|2|1
>  1
> 0.05686181376938173604|-1.56865877357861105423|14858|14817|7|0|0|158|160|6|6|2|9584|16962|42|65512|167|167|1229086934|12870|12167|12014|5|5|5|2|2|2|20000|20000|20000|20000|20000|99|99|99|99|99|3|1|0|0|1000185|1|7|1
>  2
> 0.08867460878463592766|-1.56748967932357308186|10400|10375|2|0|0|141|140|8|8|5|9290|56797|26|36|141|139|1181024763|7516|6675|6389|5|5|5|2|3|3|13182|10986|20000|20000|20000|99|99|99|99|99|2|310000001|0|0|1000431|1|10|1
>  3
> 0.13659213373632231314|-1.56927658619685916896|14012|13924|9|0|0|151|148|8|8|2|9611|52428|153|65530|160|159|1127037907|12431|12038|11937|5|5|5|2|3|3|20000|20000|20000|20000|20000|99|99|99|99|99|3|1|0|0|1000084|1|14|1
>  4
> 0.15088146914756625505|-1.57030633530367280670|16079|16329|99|5|0|223|226|1|1|1|9874|37522|0|0|127|127|1085342271|15299|14894|14657|25|25|26|7|10|13|20000|20000|20000|20000|20000|99|99|99|99|99|0|0|0|0|1000007|0|0|1
>  5
> 0.17178172876255659585|-1.56903360727616081327|13032|12989|5|0|0|148|145|8|8|2|9647|57825|126|0|157|157|1085364212|11087|10514|10353|5|5|5|2|3|3|20000|20000|20000|20000|20000|99|99|99|99|99|3|1|0|0|1000121|1|24|1
>  6
> 0.18379604688637316001|-1.56836539827576126882|15692|16287|39|0|0|195|200|2|2|2|9341|13621|65514|2|197|198|1085364149|14738|14268|13997|5|5|5|3|6|7|20000|20000|20000|20000|20000|99|99|99|99|99|4|1|0|0|1000243|0|0|1
>  7
> 
> The data, FWIW, is an ASCII representation of the UCAC4 star catalog.
> 
> Here is an example of a record with the "x" added to it by sort:
>                                                                               
>                                      V
> 2.04433377497687374102|0.22403821980488977661|16454|20000|99|1|0|23x24|1|1|2|8603|20560|141|65324|192|191|111893392|14129|13386|13099|25|2|2|4|99|99|20000|20000|20000|20000|20000|99|99|99|99|99|0|30|0|0|118728360|0|0|515
>  44588
> 
> I still have the original and flawed sort if you're interested.
> 
> The computer this error occured on was a 16Gb machine with a 2TB drive and an 
> Intel Quad core processor running Slackware Linux 13.0.

When processing large amounts of data (25G in this case)
and seeing corruption in the content but not in the size,
it's worth considering hardware errors.

This case might be indicative of single-bit errors in RAM,
as '|' (0x7C) and 'x' (0x78) differ by only a single bit.
I would first eliminate that possibility with a RAM checker.
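
To make the single-bit observation concrete, here is a minimal Python sketch
that XORs the two ASCII codes and lists the bit positions that differ. It is
only an illustration, not anything sort itself does; the helper name
differing_bits is made up for this example.

```python
def differing_bits(a: str, b: str) -> list[int]:
    """Return the bit positions (0 = least significant) at which
    two single characters differ. Hypothetical helper for illustration."""
    diff = ord(a) ^ ord(b)  # a bit is set in diff only where a and b disagree
    return [i for i in range(8) if (diff >> i) & 1]


print(f"'|' = {ord('|'):08b}")        # 01111100
print(f"'x' = {ord('x'):08b}")        # 01111000
print(differing_bits("|", "x"))       # [2]
```

A single differing position is consistent with a flipped bit in memory, though
it does not prove that RAM is at fault.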

Note that sort uses a large memory buffer by default,
so it is more susceptible than most data processors to issues like this.

If you can reproduce the issue on another system, then
we can start looking at software errors.
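
When repeating the run elsewhere, it may help to locate every corrupted byte
rather than spotting them by eye. The sketch below assumes, from the
description and sample records above, that valid data contains only digits,
'|', '+', '-', '.', spaces and newlines; that set and the function name
find_unexpected_bytes are assumptions for illustration, so adjust them to the
real data.

```python
# Assumed alphabet of the data, inferred from the report; adjust as needed.
ALLOWED = b"0123456789|+-. \n"


def find_unexpected_bytes(path, chunk_size=1 << 20):
    """Yield (offset, byte_value) for each byte in the file at `path`
    that is not in ALLOWED. Reads in chunks so huge files fit in memory."""
    offset = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            # Fast path: deleting all allowed bytes leaves nothing
            # when the chunk is clean, so the slow loop is skipped.
            if chunk.translate(None, ALLOWED):
                for i, value in enumerate(chunk):
                    if value not in ALLOWED:
                        yield offset + i, value
            offset += len(chunk)
```

For example, list(find_unexpected_bytes("sortedHugeFile")) would give the byte
offsets of any stray characters; running it on both the input and the output
shows whether the input was already damaged before sorting.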

thanks,
Pádraig.

P.S. Please provide the version of sort (the output of "sort --version").




