gnugo-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[gnugo-devel] qsort() causes platform dependence


From: Gunnar Farneback
Subject: [gnugo-devel] qsort() causes platform dependence
Date: Wed, 19 Dec 2001 20:26:12 +0100
User-agent: EMH/1.14.1 SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.3 Emacs/20.7 (sparc-sun-solaris2.7) (with unibyte mode)

Quicksort is a sorting algorithm which has the property of being
unstable, i.e. the relative ordering of elements with the same
comparison value may change. Exactly how the ordering changes depends
on implementation specific details like the strategy for choosing the
pivot element.

GNU Go uses quicksort in a few places by calling the C library
function qsort(). Depending on what C library is used, the exact
implementation and behavior of this function may vary.

In particular qsort() is called from mkpat.c to sort the elements of
struct patval for each pattern. A reordering here has the potential to
affect the matching order in the pattern matching, which in turn may
cause varying results for modules using pattern matching, such as the
owl reading and the new atari_atari code. Sometimes this variation
indicates a bug somewhere but it may as well be valid variations due
to the designs and limitations of the algorithms.

One example where this shows up is arb:101, which gives different
results on Solaris (Sun's libc) and Linux (GNU libc). This test case
is also one of those giving a deviation in Trevor's regression view
compared to Dan's regression results.

In any case this platform dependency is an unnecessary complication,
which may severely obstruct tracking down of real bugs. Therefore I
propose that we include a qsort() compatible sort function in the
distribution for use everywhere in the engine.

My first thought was to use the quicksort implementation from GNU
libc, but it seems to rely on some gcc-specific C extensions and be
quite complex. Instead I have implemented a combsort, which, although
being somewhat unknown, gives a good tradeoff between speed and code
complexity. Testing shows that it is actually faster than the glibc
qsort() function on small data sets and within a factor of two slower
for large random data sets. Its performance does not degenerate for
common special cases (i.e. sorted or reversed data) but it seems to be
susceptible to O(N^2) behavior for repetitive data with specific cycle
lengths (which depend on the total size). This is very unlikely to be
a problem for GNU Go.

Hopefully this patch will solve all the platform dependencies we've
been observing recently.

- new function gg_sort() in gg_utils.c
- all calls to qsort() replaced by calls to gg_sort()

/Gunnar

Index: engine/matchpat.c
===================================================================
RCS file: /cvsroot/gnugo/gnugo/engine/matchpat.c,v
retrieving revision 1.17
diff -u -r1.17 matchpat.c
--- engine/matchpat.c   15 Dec 2001 14:35:26 -0000      1.17
+++ engine/matchpat.c   19 Dec 2001 16:56:36 -0000
@@ -31,6 +31,7 @@
 #endif
 
 #include "liberty.h"
+#include "gg_utils.h"
 #include "patterns.h"
 #include "dfa.h"
 
@@ -854,7 +855,7 @@
   /* Sorting patterns keep the same order as 
    * standard pattern matching algorithm */
 #if DFA_SORT
-  qsort(reorder, maxr, sizeof(int), compare_int);
+  gg_sort(reorder, maxr, sizeof(int), compare_int);
 #endif /* DFA_SORT */
 
 
Index: engine/move_reasons.c
===================================================================
RCS file: /cvsroot/gnugo/gnugo/engine/move_reasons.c,v
retrieving revision 1.50
diff -u -r1.50 move_reasons.c
--- engine/move_reasons.c       17 Dec 2001 19:53:11 -0000      1.50
+++ engine/move_reasons.c       19 Dec 2001 16:56:37 -0000
@@ -27,6 +27,7 @@
 #include <math.h>
 
 #include "liberty.h"
+#include "gg_utils.h"
 #include "random.h"
 #include "move_reasons.h"
 
@@ -3560,9 +3561,9 @@
     num_reasons = 0;
     while (move[pos].reason[num_reasons] >= 0)
       num_reasons++;
-    qsort(&(move[pos].reason[0]), num_reasons, 
-         sizeof(move[pos].reason[0]),
-         compare_move_reasons);
+    gg_sort(&(move[pos].reason[0]), num_reasons, 
+           sizeof(move[pos].reason[0]),
+           compare_move_reasons);
 
     /* Estimate the value of various aspects of the move. The order
      * is significant. Territorial value must be computed before
Index: patterns/extract_fuseki.c
===================================================================
RCS file: /cvsroot/gnugo/gnugo/patterns/extract_fuseki.c,v
retrieving revision 1.10
diff -u -r1.10 extract_fuseki.c
--- patterns/extract_fuseki.c   15 Dec 2001 14:35:26 -0000      1.10
+++ patterns/extract_fuseki.c   19 Dec 2001 16:56:37 -0000
@@ -71,6 +71,7 @@
 #include <limits.h>
 #include "../sgf/sgftree.h"
 #include "liberty.h"
+#include "gg_utils.h"
 #include "random.h"
 
 #define USAGE "\n\
@@ -349,7 +350,7 @@
 {
   common_hash_board(hash, color_to_play);
   /* Sort the 8 hash values. */
-  qsort(hash->values, 8, sizeof(hash->values[0]), compare_numbers);
+  gg_sort(hash->values, 8, sizeof(hash->values[0]), compare_numbers);
 }
 
 /* Compute invariant hash for the current situation, i.e. position
@@ -369,7 +370,7 @@
   /* Notice that we of course must wait with sorting until we have
    * added the move to the hash values.
    */
-  qsort(hash->values, 8, sizeof(hash->values[0]), compare_numbers);
+  gg_sort(hash->values, 8, sizeof(hash->values[0]), compare_numbers);
 }
 
 
@@ -906,8 +907,8 @@
 {
   int k;
   /* Sort all the collected situations. */
-  qsort(situation_table, number_of_situations, sizeof(*situation_table),
-       compare_situations);
+  gg_sort(situation_table, number_of_situations, sizeof(*situation_table),
+         compare_situations);
   
   /* Debug output. */
   if (0) {
@@ -943,8 +944,8 @@
   }
   
   /* Sort the frequency table, in falling order. */
-  qsort(frequency_table, number_of_distinct_positions,
-       sizeof(*frequency_table), compare_frequencies);
+  gg_sort(frequency_table, number_of_distinct_positions,
+         sizeof(*frequency_table), compare_frequencies);
   
   /* Debug output. */
   if (0) {
@@ -991,8 +992,8 @@
     }
     
     /* Sort the moves, in falling order. */
-    qsort(move_frequencies, number_of_moves,
-         sizeof(*move_frequencies), compare_frequencies);
+    gg_sort(move_frequencies, number_of_moves,
+           sizeof(*move_frequencies), compare_frequencies);
     
     /* Debug output. */
     if (0) {
Index: patterns/mkpat.c
===================================================================
RCS file: /cvsroot/gnugo/gnugo/patterns/mkpat.c,v
retrieving revision 1.34
diff -u -r1.34 mkpat.c
--- patterns/mkpat.c    15 Dec 2001 14:35:27 -0000      1.34
+++ patterns/mkpat.c    19 Dec 2001 16:56:37 -0000
@@ -69,6 +69,7 @@
 
 #include "patterns.h"
 #include "gg-getopt.h"
+#include "gg_utils.h"
 
 #include "dfa.h"
 
@@ -1259,7 +1260,7 @@
   assert(ci != -1 && cj != -1);
 
   /* sort the elements so that least-likely elements are tested first. */
-  qsort(elements, el, sizeof(struct patval), compare_elements);
+  gg_sort(elements, el, sizeof(struct patval), compare_elements);
 
   fprintf(outfile, "static struct patval %s%d[] = {\n", name, patno);
 
Index: utils/gg_utils.c
===================================================================
RCS file: /cvsroot/gnugo/gnugo/utils/gg_utils.c,v
retrieving revision 1.17
diff -u -r1.17 gg_utils.c
--- utils/gg_utils.c    15 Dec 2001 14:35:27 -0000      1.17
+++ utils/gg_utils.c    19 Dec 2001 16:56:38 -0000
@@ -299,6 +299,53 @@
 #endif
 }
 
+/* A sorting algorithm, call-compatible with the libc qsort() function.
+ *
+ * The reason to prefer this to standard qsort() is that quicksort is
+ * an unstable sorting algorithm, i.e. the relative ordering of
+ * elements with the same comparison value may change. Exactly how the
+ * ordering changes depends on implementation specific details like
+ * the strategy for choosing the pivot element. Thus a list with
+ * "equal" values may be sorted differently between platforms, which
+ * potentially can lead to significant differences in the move engine.
+ *
+ * This is an implementation of the combsort algorithm.
+ *
+ * Testing shows that it is faster than the GNU libc qsort() function
+ * on small data sets and within a factor of two slower for large
+ * random data sets. Its performance does not degenerate for common
+ * special cases (i.e. sorted or reversed data) but it seems to be
+ * susceptible to O(N^2) behavior for repetitive data with specific
+ * cycle lengths.
+ */
+void
+gg_sort(void *base, size_t nel, size_t width,
+       int (*cmp)(const void *, const void *))
+{
+  int gap = nel;
+  int swap_made;
+  char *end = (char *) base + width * (nel - 1);
+  do {
+    char *a, *b;
+    swap_made = 0;
+    gap = (10 * gap + 3) / 13;
+    for (a = base, b = a + gap * width; b <= end; a += width, b += width) {
+      if (cmp((void *) a, (void *) b) > 0) {
+       char *c = a;
+       char *d = b;
+       size_t size = width;
+       while (size-- > 0) {
+         char tmp = *c;
+         *c++ = *d;
+         *d++ = tmp;
+       }
+       swap_made = 1;
+      }
+    }
+  } while (gap > 1 || swap_made);
+}
+
+
 /* Reorientation of point (i, j) into (*ri, *rj) */
 void
 rotate(int i, int j, int *ri, int *rj, int bs, int rot)
Index: utils/gg_utils.h
===================================================================
RCS file: /cvsroot/gnugo/gnugo/utils/gg_utils.h,v
retrieving revision 1.6
diff -u -r1.6 gg_utils.h
--- utils/gg_utils.h    20 Oct 2001 18:25:31 -0000      1.6
+++ utils/gg_utils.h    19 Dec 2001 16:56:38 -0000
@@ -63,6 +63,10 @@
 double gg_gettimeofday(void);
 double gg_cputime(void);
 
+void gg_sort(void *base, size_t nel, size_t width,
+            int (*compar)(const void *, const void *));
+
+
 const char *gg_version(void);
 
 /* prototypes for basic reorientation functions */



reply via email to

[Prev in Thread] Current Thread [Next in Thread]