Re: GH replacement proposal (includes a bit of Unicode)

guile-devel

From:	Dirk Herrmann
Subject:	Re: GH replacement proposal (includes a bit of Unicode)
Date:	Sat, 17 Apr 2004 15:21:15 +0200
User-agent:	Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030821

Hello Marius,

in general, a nice proposal. I have some suggestions and comments, though.

Marius Vollmer wrote:

 - int scm_to_bool (SCM);
 - int scm_is_true (SCM);

   Return 0 when SCM is SCM_BOOL_F, else return 1.

I would only provide the scm_to_bool variant. There is IMO no benefit in having two functions which do exactly the same thing. Providing these two might give the impression that there is some difference between them. Further, there is always the question whether scm_is_true checks for any true value or explicitly for SCM_BOOL_T. Things would be clearer if the function provided were scm_is_false instead of scm_is_true, but given scm_to_bool, I see no reason for providing it. Thus, just leave it at scm_to_bool.
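To illustrate the ambiguity, here is a minimal sketch of the two readings of "is true". It does not use libguile: mock_scm, mock_to_bool and mock_is_exactly_true are hypothetical stand-ins invented for this example only.

```c
/* Stand-in for Guile's SCM type; NOT the real libguile representation. */
typedef enum { MOCK_FALSE, MOCK_TRUE, MOCK_INT, MOCK_EMPTY_LIST } mock_tag;

typedef struct
{
  mock_tag tag;
  long ival;                    /* only meaningful when tag == MOCK_INT */
} mock_scm;

/* Semantics from the proposal: 0 only for #f, 1 for every other value. */
static int
mock_to_bool (mock_scm x)
{
  return x.tag != MOCK_FALSE;
}

/* The other plausible reading of the name "is_true": identity with #t.
   This is the reading a user might wrongly assume. */
static int
mock_is_exactly_true (mock_scm x)
{
  return x.tag == MOCK_TRUE;
}
```

The two readings disagree for every value that is neither #f nor #t, for example the integer 0 or the empty list, which is exactly where a C programmer is most likely to guess wrong.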

** Integers

 - SCM scm_is_integer (SCM val);

Paul has already commented on the return type typo.

 - scm_t_intmax scm_to_signed_integer (SCM val,
                                       scm_t_intmax min, scm_t_intmax max);
 - scm_t_uintmax scm_to_unsigned_integer (SCM val, scm_t_uintmax max);

Since ISO-C99 specifies the intmax_t and uintmax_t typedefs, I would prefer to have functions scm_to_intmax and scm_to_uintmax, and have them behave similarly to scm_to_int etc. which you describe below. IMO, the boundary checking feature is somewhat too specific. This does not mean, however, that we could not internally use a helper function / macro that implements the functions in this way.
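A rough sketch of what I mean by an internal helper, using a mock value type instead of libguile. The names mock_scm, mock_to_signed_range, mock_to_int and mock_to_intmax are hypothetical, and a counter stands in for signalling a Scheme error.

```c
#include <limits.h>
#include <stdint.h>

/* Stand-in for an SCM integer; a real SCM could also hold bignums. */
typedef struct
{
  intmax_t value;
} mock_scm;

/* Stand-in for signalling an error: we only count the failures. */
static int mock_range_errors = 0;

/* Internal helper with the range-checking behaviour; not public API. */
static intmax_t
mock_to_signed_range (mock_scm val, intmax_t min, intmax_t max)
{
  if (val.value < min || val.value > max)
    {
      mock_range_errors++;      /* real code would not return here */
      return 0;
    }
  return val.value;
}

/* Public functions fix the range to that of their C type. */
static int
mock_to_int (mock_scm val)
{
  return (int) mock_to_signed_range (val, INT_MIN, INT_MAX);
}

static intmax_t
mock_to_intmax (mock_scm val)
{
  return mock_to_signed_range (val, INTMAX_MIN, INTMAX_MAX);
}
```

The public API then consists only of the per-type functions, while the range check lives in one place.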

In addition, it may make sense to use the generic type names signed integer and unsigned integer to provide additional functions scm_to_signed_integer and scm_to_unsigned_integer that implement the very general conversion idea that Gary Houston had proposed in http://mail.gnu.org/archive/html/guile-devel/2001-09/msg00290.html.

   Convert the SCM value VAL to a C integer when it is representable
   and when it is between min and max inclusive, or between 0 and max
   inclusive.  Signal an error when it isn't.  [...]

I think your suggestion that scm_to_<integer-type> should throw an error if the value cannot be represented is good, since such errors should not go unnoticed. For development convenience, however, I would like to have an easy way to find out whether a value can be represented or not, since it is a lot of fuss for a developer to catch errors. Thus we should consider the following possibilities for extension / modification of your proposal:


Alternative 1:
* change the functions in the following way:
 <type> scm_to_<type> (SCM value, int *success)

Instead of signalling an error, *success indicates whether the value can be represented. If *success is 0, the returned value is unspecified. If success is NULL, an error is signalled if the value cannot be represented.

Alternative 2:
* provide the following additional functions:
 <type> scm_to_<type>_2 (SCM value, int *success)

I do not yet have an idea for a good name, thus I have just added the _2 postfix.

Alternative 3:
* provide the following additional functions:
 int scm_fits_<type> (SCM value);

Return 1 if the value can be converted to a C value of type <type>, 0 otherwise. If scm_fits_<type> returns 1 for some value, it is guaranteed that a subsequent call to scm_to_<type> does not signal an error.

The advantage of alternative 1 is that it does not add additional functions to the API. Further, it is always clear to the user that the conversion may fail. If the user does not care about failure, it is possible to pass the NULL pointer for the success argument.

The advantage of alternative 2 is that the user may choose whether the error checking is of interest or not. Further, the scm_to_<type> functions can easily be implemented using the scm_to_<type>_2 functions. The disadvantage is that there are a lot of additional API functions, and that it is not easy to find good names to distinguish between the two kinds of functions.

The advantage of alternative 3 is that the scm_fits_<type> check may be used as a predicate, even if no actual conversion is desired by the user. The disadvantage of alternative 3 is that for a lot of code the checking will have to be performed twice: the user will first call scm_fits_<type> and then scm_to_<type>. Both, however, will check whether the value fits.
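The double check of alternative 3 can be made visible with a small mock. Nothing here is libguile code: mock_scm, mock_fits_int, mock_to_int and mock_convert_or_default are hypothetical names, and a counter records how often the range test runs.

```c
#include <limits.h>
#include <stdint.h>

/* Stand-in for an SCM integer. */
typedef struct
{
  intmax_t value;
} mock_scm;

/* Counts how often the range test is executed. */
static int mock_checks_performed = 0;

static int
mock_in_int_range (mock_scm v)
{
  mock_checks_performed++;
  return v.value >= INT_MIN && v.value <= INT_MAX;
}

/* Alternative 3: a pure predicate. */
static int
mock_fits_int (mock_scm v)
{
  return mock_in_int_range (v);
}

/* The conversion must still check on its own, since it cannot know
   whether the caller asked the predicate first. */
static int
mock_to_int (mock_scm v)
{
  if (!mock_in_int_range (v))
    return 0;                   /* real code would signal an error here */
  return (int) v.value;
}

/* Typical user code under alternative 3. */
static int
mock_convert_or_default (mock_scm v, int fallback)
{
  if (mock_fits_int (v))
    return mock_to_int (v);
  return fallback;
}
```

One successful conversion through mock_convert_or_default runs the range test twice, which is the cost described above.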


Thus, I would prefer either alternative 1 or 2, favoring alternative 1.
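As a sketch of how alternative 1 could look for one type, assuming a mock value type in place of SCM: mock_scm and mock_to_int are hypothetical names, and a counter stands in for signalling an error.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for an SCM integer. */
typedef struct
{
  intmax_t value;
} mock_scm;

/* Stand-in for signalling an error: we only count the failures. */
static int mock_errors_signalled = 0;

/* Alternative 1: one function, optional success flag.
   - success != NULL: report the outcome through *success, no error.
   - success == NULL: an unrepresentable value is an error.  */
static int
mock_to_int (mock_scm v, int *success)
{
  int fits = v.value >= INT_MIN && v.value <= INT_MAX;

  if (success != NULL)
    *success = fits;
  else if (!fits)
    mock_errors_signalled++;    /* real code would not return here */

  /* When the value does not fit, the result is unspecified; this
     sketch happens to return 0. */
  return fits ? (int) v.value : 0;
}
```

A caller that wants to handle failure locally writes mock_to_int (v, &ok) and tests ok; a caller that does not care passes NULL and gets the error behaviour of the original proposal.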

** Floating point numbers

We don't go to such a great length to cover all possible types
here. "double" ought to be enough, no?

According to ISO-C99 there are the types float, double and long double. For the moment I agree that double would be sufficient. And, since the naming pattern is quite symmetric, it would not be a problem to extend it to float and long double, if there were interest in the community.

 - SCM scm_from_complex_double (double re, double im);
 - double scm_to_real_part_double (SCM z);
 - double scm_to_imag_part_double (SCM z);

But remember to use the generic functions scm_make_rectangular,
scm_real_part, etc if you don't care whether the parts of a complex
number are floating point numbers or not.  For example, Guile might
someday offer complex numbers where the real part is a fraction
(currently it is always a double) and it is good to be prepared for
this by not treating the parts of a complex as doubles when it is not
needed.

We should be prepared to provide conversion functions for the new ISO-C99 types float _Complex, double _Complex, long double _Complex, float _Imaginary, double _Imaginary and long double _Imaginary. Thus, the naming scheme used above seems a bit confusing if we later expect a function scm_from_double_complex to be added. What about using the pattern scm_from_<type>_r_<type>_i if a complex is to be constructed from separately given real and imaginary parts, and scm_from_<type> if a complex is to be constructed from one value? This would allow for a lot of combinations, e.g. the first function from your proposal would be renamed to scm_from_double_r_double_i; other possibilities could be scm_from_float_complex, scm_from_double_r_imaginary_i etc. Not that these are needed today, but the naming scheme should be designed from the start to be open for such extensions.
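To show how the two name patterns would sit side by side, here is a sketch using the C99 <complex.h> types. The struct mock_scm_complex is a stand-in for a Scheme complex number, and both functions are hypothetical; neither exists in Guile.

```c
#include <complex.h>

/* Stand-in for a Scheme complex number; NOT the libguile type. */
typedef struct
{
  double re;
  double im;
} mock_scm_complex;

/* Pattern scm_from_<type>_r_<type>_i: the parts are given separately. */
static mock_scm_complex
mock_from_double_r_double_i (double re, double im)
{
  mock_scm_complex z = { re, im };
  return z;
}

/* Pattern scm_from_<type>: the complex is given as a single C99 value. */
static mock_scm_complex
mock_from_double_complex (double complex z)
{
  return mock_from_double_r_double_i (creal (z), cimag (z));
}
```

With this scheme the name always spells out the C argument types, so a later float or long double variant does not collide with an existing name.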

** Characters

A Scheme character in Guile is equivalent to a Unicode code point.

Yes!

 - SCM scm_from_locale_string (unsigned char *str, ssize_t len);

 Return a new Scheme string initialized with STR, a string encoded
 according to the current locale.  When LEN is -1, STR must be
 zero-terminated and its length is found that way.  Otherwise LEN
 gives the length of STR.

I would prefer to have two functions like scm_from_locale_memory (with an unsigned len argument) and scm_from_locale_c_string rather than using -1 as a magic number. The same holds for the other scm_from_<string-type> functions that you describe below.
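A sketch of the split, with a mock string type in place of SCM. The names mock_string, mock_from_locale_memory and mock_from_locale_c_string are hypothetical, and the sketch only copies bytes; it performs no locale decoding.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for a Scheme string: a byte buffer with explicit length. */
typedef struct
{
  unsigned char *bytes;
  size_t len;
} mock_string;

/* Explicit-length variant: len is unsigned, so no magic value exists.
   The buffer may contain zero bytes.  On allocation failure the
   result has bytes == NULL and len == 0. */
static mock_string
mock_from_locale_memory (const unsigned char *buf, size_t len)
{
  mock_string s;

  s.bytes = malloc (len ? len : 1);
  s.len = 0;
  if (s.bytes != NULL)
    {
      memcpy (s.bytes, buf, len);
      s.len = len;
    }
  return s;
}

/* Zero-terminated variant: the length is found with strlen. */
static mock_string
mock_from_locale_c_string (const char *str)
{
  return mock_from_locale_memory ((const unsigned char *) str,
                                  strlen (str));
}
```

Each function then has exactly one meaning, and a negative length can no longer be passed by accident.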

 - unsigned char *scm_to_locale_string (SCM str, size_t *lenp);

 Convert STR into a C string that is encoded as specified by the
 current locale.  Memory is allocated for the C string that can be
 freed with 'free'.

 When the current locale can not encode STR, an error is signalled.

 When LENP is not NULL, the number of bytes contained in the returned
 string is stored in *LENP.  The string is zero-terminated, but it
 might contain zero characters in the middle.

Is the terminal zero counted or not?
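The question matters because with embedded zero characters the caller cannot recover the length with strlen. The following sketch assumes, purely for illustration, that the count excludes the terminator; mock_to_locale_string is a hypothetical stand-in that copies bytes and is not the proposed function.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a malloc'd copy of the n bytes at src, followed by one
   terminating zero.  ASSUMPTION for this sketch: *lenp receives n,
   i.e. the terminator is NOT counted. */
static unsigned char *
mock_to_locale_string (const unsigned char *src, size_t n, size_t *lenp)
{
  unsigned char *out = malloc (n + 1);

  if (out == NULL)
    return NULL;
  memcpy (out, src, n);
  out[n] = '\0';
  if (lenp != NULL)
    *lenp = n;
  return out;
}
```

For the five bytes "ab\0cd", strlen on the result stops at the embedded zero, so only *lenp tells the caller how much data there is. If the terminator were counted instead, every caller copying *lenp bytes would be off by one relative to this sketch, which is why the documentation should state it.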

The above functions always return newly allocated memory.  When that
is deemed too expensive, the following functions can be used instead.
However, care must be taken to use them correctly and reasonably.

 - scm_lock_heap ();
 - scm_unlock_heap ();

I urgently suggest that we do not provide such a concept. It makes too many implementation details visible to the user (the way strings are implemented internally) and influences critical system parts (memory management). It is not foreseeable in which way this may inhibit further development. From my work on the evaluator I can tell you how much leaked implementation details inhibit improvements.


Best regards
Dirk
