bug-coreutils

bug#13638: linux-sort inconsistency


From: Knud Arnbjerg Christensen
Subject: bug#13638: linux-sort inconsistency
Date: Wed, 6 Feb 2013 10:49:38 +0000

Hi
A linux-sort inconsistency occurs when sorting on an alphanumeric field:
the order differs depending on whether the following field is numeric
(file 1) or alphanumeric (file 2). In case 1 the shorter fields appear to be
extended with 'zeros'; in case 2 the fields appear to be extended with blanks,
which causes the different sort order.

knud c

sort -k 1 file1 >file1-sorted
Seq_101615 00022   x 03262 03068
Seq_101656 00001   x 03068 00470
Seq_101744 00001   x 00470 00586
Seq_10187 00001   x 00181 00553
Seq_10190 00001   x 00553 01182
Seq_101903 00001   x 00586 00331
Seq_101949 00001   x 00331 00822
Seq_10201 00001   x 01182 00396
Seq_10203 00001   x 00396 00499
Seq_10205 00001   x 00499 00603
Seq_10210 00013   x 00603 00370
Seq_1021 00001   x 00744 01203
Seq_102103 00001   x 00822 01356
Seq_102146 00001   x 01356 00303
Seq_10224 00001   x 00370 00864
Seq_10226 00001   x 00864 00205
Seq_102287 00001   x 00303 00290
Seq_102291 00001   x 00290 01632
Seq_1023 00025   x 01203 02268
Seq_102331 00001   x 01632 00204
Seq_102334 00001   x 00204 00354
Seq_102389 00001   x 00354 00303
Seq_1024 00001   x 02268 01267
Seq_102421 00001   x 00303 00281
Seq_102427 00001   x 00281 00757
Seq_10247 00001   x 00205 00406
Seq_10250 00001   x 00406 00647
Seq_102555 00001   x 00757 01351

sort -k 1 file2 >file2-sorted
Seq_101615 complete MYRIP Rab effector MyRIP   3161
Seq_101656 incomplete BFSP2 Phakinin   590
Seq_101744 incomplete CK048 Uncharacterized protein C11orf48   678
Seq_10187 incomplete B4DN50 Gap junction protein   640
Seq_101903 incomplete FAIM1 Fas apoptotic inhibitory molecule 1   416
Seq_10190 incomplete HSF2 Heat shock factor protein 2   1273
Seq_101949 incomplete TCEA3 Transcription elongation factor A protein 3   906
Seq_10201 incomplete E9PNK6 Tumor protein D52-like 1   482
Seq_102103 incomplete ATR Serine/threonine-protein kinase ATR   1456
Seq_10210 complete CENPW Centromere protein W   470
Seq_102146 incomplete E7ET15 U2 snRNP-associated SURP domain-containing   388
Seq_1021 incomplete B1AMR4 Cdc42 guanine nucleotide exchange factor (GEF) 9   1293
Seq_10224 complete SAMD3 Sterile alpha motif domain-containing protein 3   964
Seq_10226 incomplete Q6R5J7 4.1G isoform   292
Seq_102287 incomplete CBPB1 Carboxypeptidase B   387
Seq_102291 incomplete CBPA3 Mast cell carboxypeptidase A   1721
Seq_102331 incomplete T4S1 Transmembrane 4 L6 family member 1   290
Seq_102334 incomplete F8WBG6 Transmembrane 4 L six family member 1   439
Seq_102389 incomplete C9JQ45 Profilin   388
Seq_1023 complete ELF4 ETS-related transcription factor Elf-4   2353
Seq_102421 incomplete KRR1 KRR1 small subunit processome component homolog   368
Seq_102427 incomplete MD12L Mediator of RNA polymerase II transcription subunit 12-like protein  857
Seq_10247 incomplete ERD21 ER lumen protein retaining receptor 1   493
Seq_1024 incomplete JKIP3 Janus kinase and microtubule-interacting protein 3   1374
Seq_10250 incomplete S35D2 UDP-N-acetylglucosamine/UDP-glucose/GDP-mannose transporter   740
Seq_102555 incomplete GP149 Probable G-protein coupled receptor 149   1451
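
One plausible reading of the two listings above, offered as an assumption rather than a confirmed diagnosis: `-k 1` starts the key at field 1 but runs it to the end of the line, and in many non-C locales the collation rules give blanks and punctuation little or no weight, so the text after the first field can influence the order. The sketch below limits the key to the first field with `-k1,1` and forces byte-wise comparison with `LC_ALL=C`. The files `f1` and `f2` are hypothetical samples built from a few first-field values in the listings, not the reporter's actual files.

```shell
# Hypothetical sample data: same first field, different second field.
tmpdir=$(mktemp -d)
printf '%s\n' 'Seq_101903 00001' 'Seq_10190 00001' \
              'Seq_1021 00001'   'Seq_10210 00013' > "$tmpdir/f1"
printf '%s\n' 'Seq_101903 incomplete' 'Seq_10190 incomplete' \
              'Seq_1021 incomplete'   'Seq_10210 complete' > "$tmpdir/f2"

# -k1,1 ends the key at field 1; LC_ALL=C compares raw bytes,
# so the second field cannot affect the order of distinct keys.
order1=$(LC_ALL=C sort -k1,1 "$tmpdir/f1" | cut -d' ' -f1)
order2=$(LC_ALL=C sort -k1,1 "$tmpdir/f2" | cut -d' ' -f1)

# Print the first-field order obtained from each sample.
echo "$order1"
echo "$order2"

rm -rf "$tmpdir"
```

With these settings both samples come out in the same first-field order. That order is plain byte order, which need not match either listing above.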


