guile-commits
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Guile-commits] GNU Guile branch, master, updated. release_1-9-2-147-gaa


From: Michael Gran
Subject: [Guile-commits] GNU Guile branch, master, updated. release_1-9-2-147-gaa2cba9
Date: Wed, 02 Sep 2009 13:48:20 +0000

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU Guile".

http://git.savannah.gnu.org/cgit/guile.git/commit/?id=aa2cba9c882ba8bd69750b120d2b7ccd7250b562

The branch, master has been updated
       via  aa2cba9c882ba8bd69750b120d2b7ccd7250b562 (commit)
       via  4d07801b214c83761b6669b10dc0608c3081043b (commit)
       via  08ed805879a0c5fc5231522f99c7af782ac3cd8b (commit)
       via  aff31b0f9946999d9adbf582fabbbd3e523e29e8 (commit)
       via  f4cdfe6140923b5cf11d7af9a2098514d98a61a5 (commit)
       via  bde543e88b070b5fcd2b08ba8129ec1125ab0861 (commit)
       via  91772d8f8af742e864cccf5578776f09bfecf7e9 (commit)
       via  693e72891f28a1e009bd4661711d11d8f2b0e201 (commit)
       via  7165abeba8e97358410146a36c12b2a5fa5e2463 (commit)
      from  7f7b85cbf68a8b83e1ad7bc78379cf2764fc9a1b (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit aa2cba9c882ba8bd69750b120d2b7ccd7250b562
Author: Michael Gran <address@hidden>
Date:   Wed Sep 2 06:45:05 2009 -0700

    Remove always-true range checks in scm_i_ucs_range_to_char_set
    
    * libguile/srfi-14.c (scm_i_ucs_range_to_char_set): limits are always
      non-negative due to the type of the variable

commit 4d07801b214c83761b6669b10dc0608c3081043b
Author: Michael Gran <address@hidden>
Date:   Wed Sep 2 06:26:50 2009 -0700

    More srfi-14 char-set tests
    
    * test-suite/tests/srfi-14.test: many new tests

commit 08ed805879a0c5fc5231522f99c7af782ac3cd8b
Author: Michael Gran <address@hidden>
Date:   Wed Sep 2 06:21:40 2009 -0700

    Unreachable code in charset set operator
    
    * libguile/srfi-14.c (scm_i_charset_set): remove unreachable code
      in scm_i_charset_set

commit aff31b0f9946999d9adbf582fabbbd3e523e29e8
Author: Michael Gran <address@hidden>
Date:   Wed Sep 2 06:20:45 2009 -0700

    Optimize charset union operator
    
    * libguile/srfi-14.c (charsets_union): call scm_i_charset_set_range
      instead of setting characters one-by-one.

commit f4cdfe6140923b5cf11d7af9a2098514d98a61a5
Author: Michael Gran <address@hidden>
Date:   Wed Sep 2 06:19:21 2009 -0700

    The charset complement operator should not include surrogates
    
    * libguile/srfi-14.c (charsets_complement): skip over surrogates
      when making a charset complement

commit bde543e88b070b5fcd2b08ba8129ec1125ab0861
Author: Michael Gran <address@hidden>
Date:   Wed Sep 2 06:16:35 2009 -0700

    char-set-filter! does not properly iterate over the charset
    
    * libguile/srfi-14.c (scm_char_set_filter_x): iterate over
      codepoints

commit 91772d8f8af742e864cccf5578776f09bfecf7e9
Author: Michael Gran <address@hidden>
Date:   Wed Sep 2 06:14:49 2009 -0700

    ucs-range->char-set should not store surrogates and has off-by-one error
    
    * libguile/srfi-14.c (scm_i_ucs_range_to_char_set): new function that
      contains the functionality of ucs_range_to_char_set, fixes
      off-by-one, and doesn't store surroges
      (scm_ucs_range_to_char_set, scm_ucs_range_to_char_set_x): call
      scm_i_ucs_range_to_char_set
      (scm_i_charset_set_range): new helper function

commit 693e72891f28a1e009bd4661711d11d8f2b0e201
Author: Michael Gran <address@hidden>
Date:   Wed Sep 2 06:03:28 2009 -0700

    char-set-any improperly unpacks charset data
    
    * libguile/srfi-14.c (scm_char_set_any): unpack the charset correctly

commit 7165abeba8e97358410146a36c12b2a5fa5e2463
Author: Michael Gran <address@hidden>
Date:   Wed Sep 2 06:02:14 2009 -0700

    char-set-xor! should modify the input parameter
    
    char-set-xor! was not modifying its input parameter.  It isn't
    technically required to do so by the spec, but, the other similar
    functions do it.
    
    * libguile/srfi-14.c (scm_char_set_xor_x): modify the input parameter

-----------------------------------------------------------------------

Summary of changes:
 libguile/srfi-14.c            |  281 ++++++++++++++++++++--------
 test-suite/tests/srfi-14.test |  406 ++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 607 insertions(+), 80 deletions(-)

diff --git a/libguile/srfi-14.c b/libguile/srfi-14.c
index 50229ef..33b508d 100644
--- a/libguile/srfi-14.c
+++ b/libguile/srfi-14.c
@@ -86,18 +86,11 @@ scm_i_charset_set (scm_t_char_set *cs, scm_t_wchar n)
           /* This char is one below the current range. */
           if (i > 0 && cs->ranges[i - 1].hi + 1 == n)
             {
-              /* It is also one above the previous range, so combine them.  */
-              cs->ranges[i - 1].hi = cs->ranges[i].hi;
-              if (i < len - 1)
-                memmove (cs->ranges + i, cs->ranges + (i + 1),
-                         sizeof (scm_t_char_range) * (len - i - 1));
-              cs->ranges = scm_gc_realloc (cs->ranges,
-                                           sizeof (scm_t_char_range) * len,
-                                           sizeof (scm_t_char_range) * (len -
-                                                                        1),
-                                           "character-set");
-              cs->len = len - 1;
-              return;
+              /* It is also one above the previous range.  */
+              /* This is an impossible condition: in the previous
+                 iteration, the test for 'one above the current range'
+                 should already have inserted the character here.  */
+              abort ();
             }
           else
             {
@@ -168,6 +161,103 @@ scm_i_charset_set (scm_t_char_set *cs, scm_t_wchar n)
   return;
 }
 
+/* Put LO to HI inclusive into charset CS.  */
+static void
+scm_i_charset_set_range (scm_t_char_set *cs, scm_t_wchar lo, scm_t_wchar hi)
+{
+  size_t i;
+
+  i = 0;
+  while (i < cs->len)
+    {
+      /* Already in this range  */
+      if (cs->ranges[i].lo <= lo && cs->ranges[i].hi >= hi)
+        return;
+
+      /* cur:       +---+
+         new: +---+
+      */
+      if (cs->ranges[i].lo - 1 > hi)
+        {
+          /* Add a new range below the current one.  */
+          cs->ranges = scm_gc_realloc (cs->ranges,
+                                       sizeof (scm_t_char_range) * cs->len,
+                                       sizeof (scm_t_char_range) * (cs->len + 
1),
+                                       "character-set");
+          memmove (cs->ranges + (i + 1), cs->ranges + i,
+                   sizeof (scm_t_char_range) * (cs->len - i));
+          cs->ranges[i].lo = lo;
+          cs->ranges[i].hi = hi;
+          cs->len += 1;
+          return;
+        }
+
+      /* cur:      +---+  or     +---+  or    +---+
+         new: +---+          +---+         +---+
+      */
+      if (cs->ranges[i].lo > lo
+          && (cs->ranges[i].lo - 1 <= hi && cs->ranges[i].hi >= hi))
+        {
+          cs->ranges[i].lo = lo;
+          return;
+        }
+
+      /* cur: +---+    or +---+     or +---+
+         new:   +---+         +---+         +---+
+      */
+      else if (cs->ranges[i].hi + 1 >= lo && cs->ranges[i].hi < hi)
+        {
+          if (cs->ranges[i].lo > lo)
+            cs->ranges[i].lo = lo;
+          if (cs->ranges[i].hi < hi)
+            cs->ranges[i].hi = hi;
+          while (i < cs->len - 1)
+            {
+              /* cur: --+    +---+
+                 new: -----+
+              */
+              if (cs->ranges[i + 1].lo - 1 > hi)
+                break;
+              
+              /* cur: --+   +---+  or  --+  +---+  or --+ +--+
+                 new: -----+           ------+        ---------+
+              */
+              /* Combine this range with the previous one.  */
+              if (cs->ranges[i + 1].hi > hi)
+                cs->ranges[i].hi = cs->ranges[i + 1].hi;
+              if (i + 1 < cs->len)
+                memmove (cs->ranges + i + 1, cs->ranges + i + 2,
+                         sizeof (scm_t_char_range) * (cs->len - i - 2));
+              cs->ranges = scm_gc_realloc (cs->ranges,
+                                           sizeof (scm_t_char_range) * cs->len,
+                                           sizeof (scm_t_char_range) * 
(cs->len - 1),
+                                           "character-set");
+              cs->len -= 1;
+            }
+          return;
+        }
+      i ++;
+    }
+
+  /* This is a new range above all previous ranges.  */
+  if (cs->len == 0)
+    {
+      cs->ranges = scm_gc_malloc (sizeof (scm_t_char_range), "character-set");
+    }
+  else
+    {
+      cs->ranges = scm_gc_realloc (cs->ranges,
+                                   sizeof (scm_t_char_range) * cs->len,
+                                   sizeof (scm_t_char_range) * (cs->len + 1),
+                                   "character-set");
+    }
+  cs->len += 1;
+  cs->ranges[cs->len - 1].lo = lo;
+  cs->ranges[cs->len - 1].hi = hi;
+
+  return;
+}
+
 /* If N is in charset CS, remove it.  */
 void
 scm_i_charset_unset (scm_t_char_set *cs, scm_t_wchar n)
@@ -302,7 +392,7 @@ static void
 charsets_union (scm_t_char_set *a, scm_t_char_set *b)
 {
   size_t i = 0;
-  scm_t_wchar blo, bhi, n;
+  scm_t_wchar blo, bhi;
 
   if (b->len == 0)
     return;
@@ -316,13 +406,11 @@ charsets_union (scm_t_char_set *a, scm_t_char_set *b)
       return;
     }
 
-  /* This needs optimization.  */
   while (i < b->len)
     {
       blo = b->ranges[i].lo;
       bhi = b->ranges[i].hi;
-      for (n = blo; n <= bhi; n++)
-        scm_i_charset_set (a, n);
+      scm_i_charset_set_range (a, blo, bhi);
 
       i++;
     }
@@ -374,22 +462,35 @@ charsets_intersection (scm_t_char_set *a, scm_t_char_set 
*b)
   return;
 }
 
+#define SCM_ADD_RANGE(low, high)                        \
+  do {                                                  \
+    p->ranges[idx].lo = (low);                          \
+    p->ranges[idx++].hi = (high);                       \
+  } while (0)
+#define SCM_ADD_RANGE_SKIP_SURROGATES(low, high)                  \
+  do {                                                            \
+    p->ranges[idx].lo = (low);                                    \
+    p->ranges[idx++].hi = SCM_CODEPOINT_SURROGATE_START - 1;      \
+    p->ranges[idx].lo = SCM_CODEPOINT_SURROGATE_END + 1;          \
+    p->ranges[idx++].hi = (high);                                 \
+  } while (0)
+
+
+
 /* Make P the compelement of Q.  */
 static void
 charsets_complement (scm_t_char_set *p, scm_t_char_set *q)
 {
   int k, idx;
 
+  idx = 0;
   if (q->len == 0)
     {
       /* Fill with all valid codepoints.  */
       p->len = 2;
       p->ranges = scm_gc_malloc (sizeof (scm_t_char_range) * 2,
                                  "character-set");
-      p->ranges[0].lo = 0;
-      p->ranges[0].hi = SCM_CODEPOINT_SURROGATE_START - 1;
-      p->ranges[1].lo = SCM_CODEPOINT_SURROGATE_END + 1;
-      p->ranges[1].hi = SCM_CODEPOINT_MAX;
+      SCM_ADD_RANGE_SKIP_SURROGATES (0, SCM_CODEPOINT_MAX);
       return;
     }
 
@@ -397,33 +498,42 @@ charsets_complement (scm_t_char_set *p, scm_t_char_set *q)
     scm_gc_free (p->ranges, sizeof (scm_t_char_set) * p->len,
                  "character-set");
 
+  /* Count the number of ranges needed for the output.  */
   p->len = 0;
   if (q->ranges[0].lo > 0)
     p->len++;
   if (q->ranges[q->len - 1].hi < SCM_CODEPOINT_MAX)
     p->len++;
-  p->len += q->len - 1;
+  p->len += q->len;
   p->ranges =
     (scm_t_char_range *) scm_gc_malloc (sizeof (scm_t_char_range) * p->len,
                                         "character-set");
-  idx = 0;
   if (q->ranges[0].lo > 0)
     {
-      p->ranges[idx].lo = 0;
-      p->ranges[idx++].hi = q->ranges[0].lo - 1;
+      if (q->ranges[0].lo > SCM_CODEPOINT_SURROGATE_END)
+        SCM_ADD_RANGE_SKIP_SURROGATES (0, q->ranges[0].lo - 1);
+      else
+        SCM_ADD_RANGE (0, q->ranges[0].lo - 1);
     }
   for (k = 1; k < q->len; k++)
     {
-      p->ranges[idx].lo = q->ranges[k - 1].hi + 1;
-      p->ranges[idx++].hi = q->ranges[k].lo - 1;
+      if (q->ranges[k - 1].hi < SCM_CODEPOINT_SURROGATE_START
+          && q->ranges[k].lo - 1 > SCM_CODEPOINT_SURROGATE_END)
+        SCM_ADD_RANGE_SKIP_SURROGATES (q->ranges[k - 1].hi + 1, 
q->ranges[k].lo - 1);
+      else
+        SCM_ADD_RANGE (q->ranges[k - 1].hi + 1, q->ranges[k].lo - 1);
     }
   if (q->ranges[q->len - 1].hi < SCM_CODEPOINT_MAX)
     {
-      p->ranges[idx].lo = q->ranges[q->len - 1].hi + 1;
-      p->ranges[idx].hi = SCM_CODEPOINT_MAX;
+      if (q->ranges[q->len - 1].hi < SCM_CODEPOINT_SURROGATE_START)
+        SCM_ADD_RANGE_SKIP_SURROGATES (q->ranges[q->len - 1].hi + 1, 
SCM_CODEPOINT_MAX);
+      else
+        SCM_ADD_RANGE (q->ranges[q->len - 1].hi + 1, SCM_CODEPOINT_MAX);
     }
   return;
 }
+#undef SCM_ADD_RANGE
+#undef SCM_ADD_RANGE_SKIP_SURROGATES
 
 /* Replace A with elements only found in one of A or B.  */
 static void
@@ -1201,7 +1311,7 @@ SCM_DEFINE (scm_char_set_filter_x, "char-set-filter!", 3, 
0, 0,
   for (k = 0; k < p->len; k++)
     for (n = p->ranges[k].lo; n <= p->ranges[k].hi; n++)
       {
-        SCM res = scm_call_1 (pred, SCM_MAKE_CHAR (k));
+        SCM res = scm_call_1 (pred, SCM_MAKE_CHAR (n));
 
         if (scm_is_true (res))
           SCM_CHARSET_SET (base_cs, n);
@@ -1211,27 +1321,18 @@ SCM_DEFINE (scm_char_set_filter_x, "char-set-filter!", 
3, 0, 0,
 #undef FUNC_NAME
 
 
-SCM_DEFINE (scm_ucs_range_to_char_set, "ucs-range->char-set", 2, 2, 0,
-           (SCM lower, SCM upper, SCM error, SCM base_cs),
-           "Return a character set containing all characters whose\n"
-           "character codes lie in the half-open range\n"
-           "address@hidden,@var{upper}).\n"
-           "\n"
-           "If @var{error} is a true value, an error is signalled if the\n"
-           "specified range contains characters which are not contained in\n"
-           "the implemented character range.  If @var{error} is @code{#f},\n"
-           "these characters are silently left out of the resultung\n"
-           "character set.\n"
-           "\n"
-           "The characters in @var{base_cs} are added to the result, if\n"
-           "given.")
-#define FUNC_NAME s_scm_ucs_range_to_char_set
+/* Return a character set containing all the characters from [LOWER,UPPER),
+   giving range errors if ERROR, adding chars from BASE_CS, and recycling
+   BASE_CS if REUSE is true.  */
+static SCM
+scm_i_ucs_range_to_char_set (const char *FUNC_NAME, SCM lower, SCM upper, 
+                             SCM error, SCM base_cs, int reuse)
 {
   SCM cs;
   size_t clower, cupper;
 
   clower = scm_to_size_t (lower);
-  cupper = scm_to_size_t (upper);
+  cupper = scm_to_size_t (upper) - 1;
   SCM_ASSERT_RANGE (2, upper, cupper >= clower);
   if (!SCM_UNBNDP (error))
     {
@@ -1239,28 +1340,66 @@ SCM_DEFINE (scm_ucs_range_to_char_set, 
"ucs-range->char-set", 2, 2, 0,
         {
           SCM_ASSERT_RANGE (1, lower, SCM_IS_UNICODE_CHAR (clower));
           SCM_ASSERT_RANGE (2, upper, SCM_IS_UNICODE_CHAR (cupper));
+          if (clower < SCM_CODEPOINT_SURROGATE_START 
+              && cupper > SCM_CODEPOINT_SURROGATE_END)
+            scm_error(scm_out_of_range_key,
+                      FUNC_NAME, "invalid range - contains surrogate 
characters: ~S to ~S",
+                      scm_list_2 (lower, upper), scm_list_1 (upper));
         }
     }
-  if (clower > 0x10FFFF)
-    clower = 0x10FFFF;
-  if (cupper > 0x10FFFF)
-    cupper = 0x10FFFF;
+
   if (SCM_UNBNDP (base_cs))
     cs = make_char_set (FUNC_NAME);
   else
     {
       SCM_VALIDATE_SMOB (4, base_cs, charset);
-      cs = scm_char_set_copy (base_cs);
+      if (reuse)
+        cs = base_cs;
+      else
+        cs = scm_char_set_copy (base_cs);
     }
-  /* It not be difficult to write a more optimized version of the
-     following.  */
-  while (clower < cupper)
+
+  if ((clower >= SCM_CODEPOINT_SURROGATE_START && clower <= 
SCM_CODEPOINT_SURROGATE_END)
+      && (cupper >= SCM_CODEPOINT_SURROGATE_START && cupper <= 
SCM_CODEPOINT_SURROGATE_END))
+    return cs;
+
+  if (clower > SCM_CODEPOINT_MAX)
+    clower = SCM_CODEPOINT_MAX;
+  if (clower >= SCM_CODEPOINT_SURROGATE_START  && clower <= 
SCM_CODEPOINT_SURROGATE_END)
+    clower = SCM_CODEPOINT_SURROGATE_END + 1;
+  if (cupper > SCM_CODEPOINT_MAX)
+    cupper = SCM_CODEPOINT_MAX;
+  if (cupper >= SCM_CODEPOINT_SURROGATE_START && cupper <= 
SCM_CODEPOINT_SURROGATE_END)
+    cupper = SCM_CODEPOINT_SURROGATE_START - 1;
+  if (clower < SCM_CODEPOINT_SURROGATE_START && cupper > 
SCM_CODEPOINT_SURROGATE_END)
     {
-      SCM_CHARSET_SET (cs, clower);
-      clower++;
+      scm_i_charset_set_range (SCM_CHARSET_DATA (cs), clower, 
SCM_CODEPOINT_SURROGATE_START - 1);
+      scm_i_charset_set_range (SCM_CHARSET_DATA (cs), 
SCM_CODEPOINT_SURROGATE_END + 1, cupper);
     }
+  else
+    scm_i_charset_set_range (SCM_CHARSET_DATA (cs), clower, cupper);
   return cs;
 }
+
+SCM_DEFINE (scm_ucs_range_to_char_set, "ucs-range->char-set", 2, 2, 0,
+           (SCM lower, SCM upper, SCM error, SCM base_cs),
+           "Return a character set containing all characters whose\n"
+           "character codes lie in the half-open range\n"
+           "address@hidden,@var{upper}).\n"
+           "\n"
+           "If @var{error} is a true value, an error is signalled if the\n"
+           "specified range contains characters which are not valid\n"
+           "Unicode code points.  If @var{error} is @code{#f},\n"
+           "these characters are silently left out of the resultung\n"
+           "character set.\n"
+           "\n"
+           "The characters in @var{base_cs} are added to the result, if\n"
+           "given.")
+#define FUNC_NAME s_scm_ucs_range_to_char_set
+{
+  return scm_i_ucs_range_to_char_set (FUNC_NAME, lower, upper, 
+                                      error, base_cs, 0);
+}
 #undef FUNC_NAME
 
 
@@ -1280,28 +1419,9 @@ SCM_DEFINE (scm_ucs_range_to_char_set_x, 
"ucs-range->char-set!", 4, 0, 0,
            "returned.")
 #define FUNC_NAME s_scm_ucs_range_to_char_set_x
 {
-  size_t clower, cupper;
-
-  clower = scm_to_size_t (lower);
-  cupper = scm_to_size_t (upper);
-  SCM_ASSERT_RANGE (2, upper, cupper >= clower);
-  if (scm_is_true (error))
-    {
-      SCM_ASSERT_RANGE (1, lower, SCM_IS_UNICODE_CHAR (clower));
-      SCM_ASSERT_RANGE (2, upper, SCM_IS_UNICODE_CHAR (cupper));
-    }
-  if (clower > SCM_CODEPOINT_MAX)
-    clower = SCM_CODEPOINT_MAX;
-  if (cupper > SCM_CODEPOINT_MAX)
-    cupper = SCM_CODEPOINT_MAX;
-
-  while (clower < cupper)
-    {
-      if (SCM_IS_UNICODE_CHAR (clower))
-        SCM_CHARSET_SET (base_cs, clower);
-      clower++;
-    }
-  return base_cs;
+  SCM_VALIDATE_SMOB (4, base_cs, charset);  
+  return scm_i_ucs_range_to_char_set (FUNC_NAME, lower, upper, 
+                                      error, base_cs, 1);
 }
 #undef FUNC_NAME
 
@@ -1495,7 +1615,9 @@ SCM_DEFINE (scm_char_set_any, "char-set-any", 2, 0, 0,
   SCM_VALIDATE_PROC (1, pred);
   SCM_VALIDATE_SMOB (2, cs, charset);
 
-  cs_data = (scm_t_char_set *) cs;
+  cs_data = SCM_CHARSET_DATA (cs);
+  if (cs_data->len == 0)
+    return SCM_BOOL_T;
 
   for (k = 0; k < cs_data->len; k++)
     for (n = cs_data->ranges[k].lo; n <= cs_data->ranges[k].hi; n++)
@@ -1859,7 +1981,8 @@ SCM_DEFINE (scm_char_set_xor_x, "char-set-xor!", 1, 0, 1,
      (char-set-xor a a a) -> char set #\a
      (char-set-xor! a a a) -> char set #\a
    */
-  return scm_char_set_xor (scm_cons (cs1, rest));
+  cs1 = scm_char_set_xor (scm_cons (cs1, rest));
+  return cs1;
 }
 #undef FUNC_NAME
 
diff --git a/test-suite/tests/srfi-14.test b/test-suite/tests/srfi-14.test
index 56c944a..f12a255 100644
--- a/test-suite/tests/srfi-14.test
+++ b/test-suite/tests/srfi-14.test
@@ -53,6 +53,214 @@
            (char-set->list (char-set #\a #\c #\e))
            (list #\a #\c #\e))))
           
+(with-test-prefix "char set additition"
+
+  (pass-if "empty + x"
+    (let ((cs (char-set)))
+      (char-set-adjoin! cs #\x)
+      (list= eqv? 
+             (char-set->list cs)
+             (list #\x))))
+
+  (pass-if "x + y"
+    (let ((cs (char-set #\x)))
+      (char-set-adjoin! cs #\y)
+      (list= eqv? 
+             (char-set->list cs)
+             (list #\x #\y))))
+
+  (pass-if "x + w"
+    (let ((cs (char-set #\x)))
+      (char-set-adjoin! cs #\w)
+      (list= eqv? 
+             (char-set->list cs)
+             (list #\w #\x))))
+
+  (pass-if "x + z"
+    (let ((cs (char-set #\x)))
+      (char-set-adjoin! cs #\z)
+      (list= eqv? 
+             (char-set->list cs)
+             (list #\x #\z))))
+
+  (pass-if "x + v"
+    (let ((cs (char-set #\x)))
+      (char-set-adjoin! cs #\v)
+      (list= eqv? 
+             (char-set->list cs)
+             (list #\v #\x))))
+
+  (pass-if "uv + w"
+    (let ((cs (char-set #\u #\v)))
+      (char-set-adjoin! cs #\w)
+      (list= eqv? 
+             (char-set->list cs)
+             (list #\u #\v #\w))))
+
+  (pass-if "uv + t"
+    (let ((cs (char-set #\u #\v)))
+      (char-set-adjoin! cs #\t)
+      (list= eqv? 
+             (char-set->list cs)
+             (list #\t #\u #\v))))
+
+  (pass-if "uv + x"
+    (let ((cs (char-set #\u #\v)))
+      (char-set-adjoin! cs #\x)
+      (list= eqv? 
+             (char-set->list cs)
+             (list #\u #\v #\x))))
+
+  (pass-if "uv + s"
+    (let ((cs (char-set #\u #\v)))
+      (char-set-adjoin! cs #\s)
+      (list= eqv? 
+             (char-set->list cs)
+             (list #\s #\u #\v))))
+
+  (pass-if "uvx + w"
+    (let ((cs (char-set #\u #\v #\x)))
+      (char-set-adjoin! cs #\w)
+      (list= eqv? 
+             (char-set->list cs)
+             (list #\u #\v #\w #\x))))
+
+  (pass-if "uvx + y"
+    (let ((cs (char-set #\u #\v #\x)))
+      (char-set-adjoin! cs #\y)
+      (list= eqv? 
+             (char-set->list cs)
+             (list #\u #\v #\x #\y))))
+
+  (pass-if "uvxy + w"
+    (let ((cs (char-set #\u #\v #\x #\y)))
+      (char-set-adjoin! cs #\w)
+      (list= eqv? 
+             (char-set->list cs)
+             (list #\u #\v #\w #\x #\y)))))
+
+(with-test-prefix "char set union"
+  (pass-if "null U abc"
+    (char-set= (char-set-union (char-set) (->char-set "abc"))
+               (->char-set "abc")))
+
+  (pass-if "ab U ab"
+    (char-set= (char-set-union (->char-set "ab") (->char-set "ab"))
+               (->char-set "ab")))
+
+  (pass-if "ab U bc"
+    (char-set= (char-set-union (->char-set "ab") (->char-set "bc"))
+               (->char-set "abc")))
+
+  (pass-if "ab U cd"
+    (char-set= (char-set-union (->char-set "ab") (->char-set "cd"))
+               (->char-set "abcd")))
+
+  (pass-if "ab U de"
+    (char-set= (char-set-union (->char-set "ab") (->char-set "de"))
+               (->char-set "abde")))
+
+  (pass-if "abc U bcd"
+    (char-set= (char-set-union (->char-set "abc") (->char-set "bcd"))
+               (->char-set "abcd")))
+
+  (pass-if "abdf U abcdefg"
+    (char-set= (char-set-union (->char-set "abdf") (->char-set "abcdefg"))
+               (->char-set "abcdefg")))
+
+  (pass-if "abef U cd"
+    (char-set= (char-set-union (->char-set "abef") (->char-set "cd"))
+               (->char-set "abcdef")))
+
+  (pass-if "abgh U cd"
+    (char-set= (char-set-union (->char-set "abgh") (->char-set "cd"))
+               (->char-set "abcdgh")))
+
+  (pass-if "bc U ab"
+    (char-set= (char-set-union (->char-set "bc") (->char-set "ab"))
+               (->char-set "abc")))
+
+  (pass-if "cd U ab"
+    (char-set= (char-set-union (->char-set "cd") (->char-set "ab"))
+               (->char-set "abcd")))
+
+  (pass-if "de U ab"
+    (char-set= (char-set-union (->char-set "de") (->char-set "ab"))
+               (->char-set "abde")))
+
+  (pass-if "cd U abc"
+    (char-set= (char-set-union (->char-set "cd") (->char-set "abc"))
+               (->char-set "abcd")))
+
+  (pass-if "cd U abcd"
+    (char-set= (char-set-union (->char-set "cd") (->char-set "abcd"))
+               (->char-set "abcd")))
+
+  (pass-if "cde U abcdef"
+    (char-set= (char-set-union (->char-set "cde") (->char-set "abcdef"))
+               (->char-set "abcdef"))))
+
+(with-test-prefix "char set xor"
+  (pass-if "null - xy"
+    (char-set= (char-set-xor (char-set) (char-set #\x #\y))
+               (char-set #\x #\y)))
+
+  (pass-if "x - x"
+    (char-set= (char-set-xor (char-set #\x) (char-set #\x))
+               (char-set)))
+
+  (pass-if "xy - x"
+    (char-set= (char-set-xor (char-set #\x #\y) (char-set #\x))
+               (char-set #\y)))
+
+  (pass-if "xy - y"
+    (char-set= (char-set-xor (char-set #\x #\y) (char-set #\y))
+               (char-set #\x)))
+
+  (pass-if "wxy - w"
+    (char-set= (char-set-xor (char-set #\w #\x #\y) (char-set #\w))
+               (char-set #\x #\y)))
+
+  (pass-if "wxy - x"
+    (char-set= (char-set-xor (char-set #\w #\x #\y) (char-set #\x))
+               (char-set #\w #\y)))
+
+  (pass-if "wxy - y"
+    (char-set= (char-set-xor (char-set #\w #\x #\y) (char-set #\y))
+               (char-set #\w #\x)))
+
+  (pass-if "uvxy - u"
+    (char-set= (char-set-xor (char-set #\u #\v #\x #\y) (char-set #\u))
+               (char-set #\v #\x #\y)))
+
+  (pass-if "uvxy - v"
+    (char-set= (char-set-xor (char-set #\u #\v #\x #\y) (char-set #\v))
+               (char-set #\u #\x #\y)))
+
+  (pass-if "uvxy - x"
+    (char-set= (char-set-xor (char-set #\u #\v #\x #\y) (char-set #\x))
+               (char-set #\u #\v #\y)))
+
+  (pass-if "uvxy - y"
+    (char-set= (char-set-xor (char-set #\u #\v #\x #\y) (char-set #\y))
+               (char-set #\u #\v #\x)))
+
+  (pass-if "uwy - u"
+    (char-set= (char-set-xor (char-set #\u #\w #\y) (char-set #\u))
+               (char-set #\w #\y)))
+
+  (pass-if "uwy - w"
+    (char-set= (char-set-xor (char-set #\u #\w #\y) (char-set #\w))
+               (char-set #\u #\y)))
+
+  (pass-if "uwy - y"
+    (char-set= (char-set-xor (char-set #\u #\w #\y) (char-set #\y))
+               (char-set #\u #\w)))
+
+  (pass-if "uvwy - v"
+    (char-set= (char-set-xor (char-set #\u #\v #\w #\y) (char-set #\v))
+               (char-set #\u #\w #\y))))
+
 
 (with-test-prefix "char-set?"
 
@@ -83,7 +291,10 @@
     (not (char-set= (char-set #\a) (char-set))))
 
   (pass-if "success, more args"
-    (char-set= char-set:blank char-set:blank char-set:blank)))
+    (char-set= char-set:blank char-set:blank char-set:blank))
+
+  (pass-if "failure, same length, different elements"
+    (not (char-set= (char-set #\a #\b #\d) (char-set #\a #\c #\d)))))
 
 (with-test-prefix "char-set<="
   (pass-if "success, no arg"
@@ -238,6 +449,199 @@
        (string=? (char-set->string cs)
                  "egilu"))))
 
+(with-test-prefix "list->char-set"
+
+  (pass-if "list->char-set"
+    (char-set= (list->char-set '(#\a #\b #\c))
+               (->char-set "abc")))
+
+  (pass-if "list->char-set!"
+    (let* ((cs (char-set #\a #\z)))
+      (list->char-set! '(#\m #\n) cs)
+      (char-set= cs
+                 (char-set #\a #\m #\n #\z)))))
+
+(with-test-prefix "string->char-set"
+
+  (pass-if "string->char-set"
+    (char-set= (string->char-set "foobar")
+               (string->char-set "barfoo")))
+
+  (pass-if "string->char-set cs"
+    (char-set= (string->char-set "foo" (string->char-set "bar"))
+               (string->char-set "barfoo")))
+
+  (pass-if "string->char-set!"
+    (let ((cs (string->char-set "bar")))
+      (string->char-set! "foo" cs)
+      (char-set= cs
+                 (string->char-set "barfoo")))))
+
+(with-test-prefix "char-set-filter"
+
+  (pass-if "filter w/o base"
+    (char-set=
+     (char-set-filter (lambda (c) (char=? c #\x))
+                      (->char-set "qrstuvwxyz"))
+     (->char-set #\x)))
+
+  (pass-if "filter w/ base"
+    (char-set=
+     (char-set-filter (lambda (c) (char=? c #\x))
+                      (->char-set "qrstuvwxyz")
+                      (->char-set "op"))
+                      
+     (->char-set "opx")))
+
+  (pass-if "filter!"
+    (let ((cs (->char-set "abc")))
+      (set! cs (char-set-filter! (lambda (c) (char=? c #\x))
+                                 (->char-set "qrstuvwxyz")
+                                 cs))
+      (char-set= (string->char-set "abcx")
+                 cs))))
+
+
+(with-test-prefix "char-set-intersection"
+
+  (pass-if "empty"
+    (char-set= (char-set-intersection (char-set) (char-set))
+               (char-set)))
+
+  (pass-if "identical, one element"
+    (char-set= (char-set-intersection (char-set #\a) (char-set #\a))
+               (char-set #\a)))
+
+  (pass-if "identical, two elements"
+    (char-set= (char-set-intersection (char-set #\a #\b) (char-set #\a #\b))
+               (char-set #\a #\b)))
+
+  (pass-if "identical, two elements"
+    (char-set= (char-set-intersection (char-set #\a #\c) (char-set #\a #\c))
+               (char-set #\a #\c)))
+
+  (pass-if "one vs null"
+    (char-set= (char-set-intersection (char-set #\a) (char-set))
+               (char-set)))
+
+  (pass-if "null vs one"
+    (char-set= (char-set-intersection (char-set) (char-set #\a))
+               (char-set)))
+
+  (pass-if "no elements shared"
+    (char-set= (char-set-intersection (char-set #\a #\c) (char-set #\b #\d))
+               (char-set)))
+
+  (pass-if "one elements shared"
+    (char-set= (char-set-intersection (char-set #\a #\c #\d) (char-set #\b 
#\d))
+               (char-set #\d))))
+
+(with-test-prefix "char-set-complement"
+
+  (pass-if "complement of null"
+           (char-set= (char-set-complement (char-set))
+                      (char-set-union (ucs-range->char-set 0 #xd800)
+                                      (ucs-range->char-set #xe000 #x110000))))
+
+  (pass-if "complement of null (2)"
+           (char-set= (char-set-complement (char-set))
+                      (ucs-range->char-set 0 #x110000)))
+
+  (pass-if "complement of #\\0"
+           (char-set= (char-set-complement (char-set #\nul))
+                      (ucs-range->char-set 1 #x110000)))
+
+  (pass-if "complement of U+10FFFF"
+           (char-set= (char-set-complement (char-set (integer->char #x10ffff)))
+                      (ucs-range->char-set 0 #x10ffff)))
+
+  (pass-if "complement of 'FOO'"
+           (char-set= (char-set-complement (->char-set "FOO"))
+                      (char-set-union (ucs-range->char-set 0 (char->integer 
#\F))
+                                      (ucs-range->char-set (char->integer #\G) 
+                                                           (char->integer #\O))
+                                      (ucs-range->char-set (char->integer #\P) 
+                                                            #x110000))))
+  (pass-if "complement of #\\a #\\b U+010300"
+           (char-set= (char-set-complement (char-set #\a #\b (integer->char 
#x010300)))
+                      (char-set-union (ucs-range->char-set 0 (char->integer 
#\a))
+                                      (ucs-range->char-set (char->integer #\c) 
#x010300)
+                                      (ucs-range->char-set #x010301 
#x110000)))))
+
+(with-test-prefix "ucs-range->char-set"
+  (pass-if "char-set"
+    (char-set= (ucs-range->char-set 65 68)
+               (->char-set "ABC")))
+
+  (pass-if "char-set w/ base"
+    (char-set= (ucs-range->char-set 65 68 #f (->char-set "DEF"))
+               (->char-set "ABCDEF")))
+
+  (pass-if "char-set!"
+    (let ((cs (->char-set "DEF")))
+      (ucs-range->char-set! 65 68 #f cs)
+      (char-set= cs
+                 (->char-set "ABCDEF")))))
+
+(with-test-prefix "char-set-count"
+  (pass-if "null"
+    (= 0 (char-set-count (lambda (c) #t) (char-set))))
+
+  (pass-if "count"
+    (= 5 (char-set-count (lambda (c) #t) 
+                         (->char-set "guile")))))
+
+(with-test-prefix "char-set-contains?"
+  (pass-if "#\\a not in null"
+    (not (char-set-contains? (char-set) #\a)))
+
+  (pass-if "#\\a is in 'abc'"
+    (char-set-contains? (->char-set "abc") #\a)))
+
+(with-test-prefix "any / every"
+  (pass-if "char-set-every #t"
+    (char-set-every (lambda (c) #t) 
+                    (->char-set "abc")))
+
+  (pass-if "char-set-every #f"
+    (not (char-set-every (lambda (c) (char=? c #\c)) 
+                         (->char-set "abc"))))
+
+  (pass-if "char-set-any #t"
+    (char-set-any (lambda (c) (char=? c #\c)) 
+                  (->char-set "abc")))
+
+  (pass-if "char-set-any #f"
+    (not (char-set-any (lambda (c) #f)
+                       (->char-set "abc")))))
+
+(with-test-prefix "char-set-delete"
+  (pass-if "abc - a"
+    (char-set= (char-set-delete (->char-set "abc") #\a)
+               (char-set #\b #\c)))
+
+  (pass-if "abc - d"
+    (char-set= (char-set-delete (->char-set "abc") #\d)
+               (char-set #\a #\b #\c)))
+
+  (pass-if "delete! abc - a"
+    (let ((cs (char-set #\a #\b #\c)))
+      (char-set-delete! cs #\a)
+      (char-set= cs (char-set #\b #\c)))))
+
+(with-test-prefix "char-set-difference"
+  (pass-if "not different"
+    (char-set= (char-set-difference (->char-set "foobar") (->char-set 
"foobar"))
+               (char-set)))
+
+  (pass-if "completely different"
+    (char-set= (char-set-difference (->char-set "foo") (->char-set "bar"))
+               (->char-set "foo")))
+
+  (pass-if "partially different"
+    (char-set= (char-set-difference (->char-set "breakfast") (->char-set 
"breakroom"))
+               (->char-set "fst"))))
+
 (with-test-prefix "standard char sets (ASCII)"
 
   (pass-if "char-set:lower-case"


hooks/post-receive
-- 
GNU Guile




reply via email to

[Prev in Thread] Current Thread [Next in Thread]