Re: [Patch] SRFI-13 string-tokenize is wrong

guile-devel

From:	Matthias Koeppe
Subject:	Re: [Patch] SRFI-13 string-tokenize is wrong
Date:	Mon, 29 Apr 2002 11:21:14 +0200
User-agent:	Gnus/5.090004 (Oort Gnus v0.04) Emacs/21.1.80 (sparc-sun-solaris2.7)

Marius Vollmer writes:

> Thanks; and sorry for being nitpicky: can we be sure that isgraphic is
> the same as charset:graphic?

We can't.  That's why I wrote that TOKEN_SET defaults to "an
equivalent" of CHAR-SET:GRAPHIC.  

The whole internationalization stuff is, of course, broken.  Some
Guile functions depend on the current locale setting; others depend on
the locale setting at load time; others silently do ASCII only.  This
clearly needs to be worked on, but I don't think STRING-TOKENIZE would
be the place to start.

BTW, when I tried to make an example of the described behavior, I got
a segmentation fault caused by an array being indexed by a signed
char (on Solaris 2.7 with the Forte compiler):

     (use-modules (srfi srfi-13) (srfi srfi-14)) 
     (string-tokenize "charsetsäareäfun" char-set:graphic)
     ==> segfault

Here is a fix:

--- srfi-14.h.~1.3.2.6.~        Tue Sep 25 13:00:41 2001
+++ srfi-14.h   Mon Apr 29 11:13:03 2002
@@ -48,15 +48,15 @@
 
 #define SCM_CHARSET_SIZE 256
 
-/* We expect 8-bit bytes here.  Shoule be no problem in the year
+/* We expect 8-bit bytes here.  Should be no problem in the year
    2001.  */
 #ifndef SCM_BITS_PER_LONG
 # define SCM_BITS_PER_LONG (sizeof (long) * 8)
 #endif
 
 #define SCM_CHARSET_GET(cs, idx) (((long *) SCM_SMOB_DATA (cs))\
-                                  [(idx) / SCM_BITS_PER_LONG] &\
-                                  (1L << ((idx) % SCM_BITS_PER_LONG)))
+                                  [((unsigned char) (idx)) / SCM_BITS_PER_LONG] &\
+                                  (1L << (((unsigned char) (idx)) % SCM_BITS_PER_LONG)))
 
 #define SCM_CHARSETP(x) (!SCM_IMP (x) && (SCM_TYP16 (x) == scm_tc16_charset))
 

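For illustration, here is a minimal standalone sketch of the same idea
outside of Guile.  It is not the Guile code: the type `charset' and the
functions charset_add, charset_contains, raw_index and safe_index are
made up for this example, and the bit array is a plain struct rather
than smob data.  It only shows that a byte above 127 held in a signed
char turns into a negative index unless it is first converted to
unsigned char.

```c
#include <limits.h>

#define CHARSET_SIZE 256
#define BITS_PER_LONG (sizeof (unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for the char-set data: one bit per byte value.  */
typedef struct
{
  unsigned long bits[CHARSET_SIZE / (sizeof (unsigned long) * CHAR_BIT)];
} charset;

/* The index as a signed char would supply it: negative for bytes > 127.  */
static int
raw_index (signed char c)
{
  return c;
}

/* The index after the cast used in the patch: always in 0..255.  */
static int
safe_index (signed char c)
{
  return (unsigned char) c;
}

/* Set the bit for C; the cast keeps the word index inside the array.  */
static void
charset_add (charset *cs, char c)
{
  unsigned char idx = (unsigned char) c;
  cs->bits[idx / BITS_PER_LONG] |= 1UL << (idx % BITS_PER_LONG);
}

/* Return 1 if the bit for C is set, 0 otherwise.  */
static int
charset_contains (const charset *cs, char c)
{
  unsigned char idx = (unsigned char) c;
  return (int) ((cs->bits[idx / BITS_PER_LONG] >> (idx % BITS_PER_LONG)) & 1UL);
}
```

With these definitions, raw_index ((signed char) -28) stays at -28,
while safe_index maps the same value to 228, so charset_add and
charset_contains never read or write outside the 256-bit array,
whether plain char is signed or unsigned on the platform.
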
-- 
Matthias Köppe -- http://www.math.uni-magdeburg.de/~mkoeppe
