Re: GH replacement proposal (includes a bit of Unicode)

From: Dirk Herrmann
Subject: Re: GH replacement proposal (includes a bit of Unicode)
Date: Sat, 17 Apr 2004 15:21:15 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030821

Hello Marius,

in general, a nice proposal. I have some suggestions and comments, though.

Marius Vollmer wrote:

 - int scm_to_bool (SCM);
 - int scm_is_true (SCM);

   Return 0 when SCM is SCM_BOOL_F, else return 1.

I would only provide the scm_to_bool variant. There is IMO no benefit in having two functions that do exactly the same thing; providing both might give the impression that they differ somehow. Further, scm_is_true always raises the question of whether it checks for any true value or explicitly for SCM_BOOL_T. Things would be clearer if the function provided were scm_is_false instead of scm_is_true, but given scm_to_bool, I see no reason to provide that either. Thus, just stick with scm_to_bool.

** Integers
 - SCM scm_is_integer (SCM val);

Paul has already commented on the return type typo.

 - scm_t_intmax scm_to_signed_integer (SCM val,
                                       scm_t_intmax min, scm_t_intmax max);
 - scm_t_uintmax scm_to_unsigned_integer (SCM val, scm_t_uintmax max);

Since ISO C99 specifies the intmax_t and uintmax_t typedefs, I would prefer to have functions scm_to_intmax and scm_to_uintmax that behave like scm_to_int etc., which you describe below. IMO, the boundary checking feature is somewhat too specific. This does not mean, however, that we could not internally use a helper function or macro that implements the functions in this way.
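To illustrate the last point, here is a minimal sketch of how fixed-type conversions could sit on top of one range-checking helper. Everything named mock_* is hypothetical: mock_scm merely models a Scheme integer as an intmax_t and is not how Guile represents values, and "signalling an error" is modelled with abort().

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for SCM: a Scheme integer modelled as an intmax_t.
   Real Guile values are not represented this way.  */
typedef struct { intmax_t n; } mock_scm;

/* Internal helper in the style of scm_to_signed_integer: return the value
   if it lies in [min, max], otherwise "signal an error" (here: abort).  */
static intmax_t mock_to_signed_integer (mock_scm v, intmax_t min, intmax_t max)
{
  assert (min <= max);
  if (v.n < min || v.n > max)
    abort ();
  return v.n;
}

/* Public conversions with fixed C types, each a thin wrapper.  */
static intmax_t mock_to_intmax (mock_scm v)
{
  return mock_to_signed_integer (v, INTMAX_MIN, INTMAX_MAX);
}

static int mock_to_int (mock_scm v)
{
  return (int) mock_to_signed_integer (v, INT_MIN, INT_MAX);
}

static short mock_to_short (mock_scm v)
{
  return (short) mock_to_signed_integer (v, SHRT_MIN, SHRT_MAX);
}
```

With this layering the boundary check stays an implementation detail, and the API only shows one function per C type.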

In addition, it may make sense to use the generic type names signed integer and unsigned integer to provide additional functions scm_to_signed_integer and scm_to_unsigned_integer that implement the very general conversion idea that Gary Houston had proposed in

   Convert the SCM value VAL to a C integer when it is representable
   and when it is between min and max inclusive, or between 0 and max
   inclusive.  Signal an error when it isn't.  [...]

I think your suggestion that scm_to_<integer-type> should throw an error if the value cannot be represented is good, since such errors should not go unnoticed. For development convenience, however, I would like an easy way to find out whether a value can be represented, since capturing errors is a lot of fuss for a developer. Thus we should consider the following possible extensions or modifications of your proposal:

Alternative 1:
* change the functions in the following way:
 <type> scm_to_<type> (SCM value, int *success)
Instead of signalling an error, *success indicates whether the value can be represented. If *success is 0, the returned value is unspecified. If success is NULL, an error is signalled when the value cannot be represented.

Alternative 2:
* provide the following additional functions:
 <type> scm_to_<type>_2 (SCM value, int *success)
I do not have an idea for a good name yet, so I have just added the _2 postfix.

Alternative 3:
* provide the following additional functions:
 int scm_fits_<type> (SCM value);
Return 1 if the value can be converted to a C value of type <type>, 0 otherwise. If scm_fits_<type> returns 1 for some value, it is guaranteed that a subsequent call to scm_to_<type> does not signal an error.

The advantage of alternative 1 is that it does not add functions to the API. Further, it is always clear to the user that the conversion may fail. If the user does not care about failure, it is possible to pass the NULL pointer for the success argument.

The advantage of alternative 2 is that the user may choose whether the error checking is of interest or not. Further, the scm_to_<type> functions can easily be implemented using the scm_to_<type>_2 functions. The disadvantage is that there are a lot of additional API functions, and that it is not easy to find good names to distinguish between the two kinds of functions.

The advantage of alternative 3 is that the scm_fits_<type> check may be used as a predicate, even if the user does not want an actual conversion. The disadvantage of alternative 3 is that for a lot of code the checking will have to be performed twice: the user will first call scm_fits_<type> and then scm_to_<type>, and both will check whether the value fits.

Thus, I would prefer either alternative 1 or 2, favoring alternative 1.
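To make the trade-off concrete, here is a rough sketch of alternatives 1 and 3 from the caller's side. All mock_* and to_int_or_* names are hypothetical: mock_scm is only a stand-in that models a Scheme integer as an intmax_t, and "signalling an error" is modelled with abort().

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for SCM: a Scheme integer modelled as an intmax_t.
   Real Guile values are not represented this way.  */
typedef struct { intmax_t n; } mock_scm;

/* Alternative 3: a pure predicate; no conversion takes place.  */
static int mock_fits_int (mock_scm v)
{
  return v.n >= INT_MIN && v.n <= INT_MAX;
}

/* Alternative 1: report failure through *success.  With success == NULL
   the function "signals an error", modelled here with abort().  */
static int mock_to_int (mock_scm v, int *success)
{
  int ok = mock_fits_int (v);
  if (success != NULL)
    *success = ok;
  else if (!ok)
    abort ();
  return ok ? (int) v.n : 0;   /* the result is unspecified when !ok */
}

/* Caller under alternative 1: one call, one range check.  */
static int to_int_or_alt1 (mock_scm v, int fallback)
{
  int ok;
  int result = mock_to_int (v, &ok);
  return ok ? result : fallback;
}

/* Caller under alternative 3: predicate first, then conversion, so the
   range is checked twice.  &ok is passed only to make the guarantee
   visible: once the predicate returned 1, the conversion cannot fail.  */
static int to_int_or_alt3 (mock_scm v, int fallback)
{
  int ok;
  int result;
  if (!mock_fits_int (v))
    return fallback;
  result = mock_to_int (v, &ok);
  assert (ok);
  return result;
}
```

Both callers give the same results; the difference is only in how many times the value is inspected and how many functions the API has to carry.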

** Floating point numbers

We don't go to such a great length to cover all possible types
here. "double" ought to be enough, no?

According to ISO C99 there are the types float, double and long double. For the moment I agree that double is sufficient. And since the naming pattern is quite symmetric, it would not be a problem to extend it to float and long double if there is interest in the community.

 - SCM scm_from_complex_double (double re, double im);
 - double scm_to_real_part_double (SCM z);
 - double scm_to_imag_part_double (SCM z);

But remember to use the generic functions scm_make_rectangular,
scm_real_part, etc if you don't care whether the parts of a complex
number are floating point numbers or not.  For example, Guile might
someday offer complex numbers where the real part is a fraction
(currently it is always a double) and it is good to be prepared for
this by not treating the parts of a complex as doubles when it is not

We should be prepared to provide conversion functions for the new ISO C99 types float _Complex, double _Complex, long double _Complex, float _Imaginary, double _Imaginary and long double _Imaginary. The naming scheme used above therefore seems a bit confusing if we expect a function scm_from_double_complex to be added later. What about using the pattern scm_from_<type>_r_<type>_i if a complex is to be constructed from separately given real and imaginary parts, and scm_from_<type> if a complex is to be constructed from one value? This would allow for a lot of combinations: e.g. the first function from your proposal would be renamed to scm_from_double_r_double_i, and other possibilities would be scm_from_float_complex, scm_from_double_r_imaginary_i etc. Not that these are needed today, but the naming scheme should be designed from the start to be open to such extensions.

** Characters

A Scheme character in Guile is equivalent to a Unicode code point.


 - SCM scm_from_locale_string (unsigned char *str, ssize_t len);

 Return a new Scheme string initialized with STR, a string encoded
 according to the current locale.  When LEN is -1, STR must be
 zero-terminated and its length is found that way.  Otherwise LEN
 gives the length of STR.

I would prefer to have two functions, such as scm_from_locale_memory (with an unsigned len argument) and scm_from_locale_c_string, rather than using -1 as a magic number. The same holds for the other scm_from_<string-type> functions that you describe below.
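A small sketch of what that split could look like. The mock_* names and the mock_string type are hypothetical stand-ins, and no locale decoding is performed here; the point is only that the zero-terminated variant becomes a thin wrapper around the explicit-length one, so no magic length value is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a Scheme string: an owned byte buffer.
   LEN does not count the extra zero byte stored after the data.  */
typedef struct { unsigned char *bytes; size_t len; } mock_string;

/* Explicit-length variant: LEN is unsigned and embedded zero bytes are
   allowed.  Allocation failure is treated as fatal in this sketch.  */
static mock_string mock_from_locale_memory (const unsigned char *buf, size_t len)
{
  mock_string s;
  assert (buf != NULL || len == 0);
  s.bytes = malloc (len + 1);
  if (s.bytes == NULL)
    abort ();
  if (len > 0)
    memcpy (s.bytes, buf, len);
  s.bytes[len] = 0;
  s.len = len;
  return s;
}

/* Zero-terminated variant: the length is found with strlen.  */
static mock_string mock_from_locale_c_string (const char *str)
{
  assert (str != NULL);
  return mock_from_locale_memory ((const unsigned char *) str, strlen (str));
}
```

A caller holding a C string uses the second function; a caller holding a buffer that may contain zero bytes uses the first and passes the length explicitly.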

 - unsigned char *scm_to_locale_string (SCM str, size_t *lenp);

 Convert STR into a C string that is encoded as specified by the
 current locale.  Memory is allocated for the C string that can be
 freed with 'free'.

 When the current locale can not encode STR, an error is signalled.

 When LENP is not NULL, the number of bytes contained in the returned
 string is stored in *LENP.  The string is zero-terminated, but it
 might contain zero characters in the middle.

Is the terminating zero counted in *LENP or not?

The above functions always return newly allocated memory.  When that
is deemed too expensive, the following functions can be used instead.
However, care must be taken to use them correctly and reasonably.

 - scm_lock_heap ();
 - scm_unlock_heap ();

I strongly suggest that we do not provide such a concept. It exposes too many implementation details to the user (the way strings are implemented internally) and affects critical system parts (memory management). It is not foreseeable how this may inhibit further development. From my work on the evaluator I can tell you how much leaked implementation details inhibit improvements.

Best regards
