C strings and other things!

From: Keith Hopper
Subject: C strings and other things!
Date: Thu, 14 Dec 2000 14:02:53 +1300
User-agent: Pluto/2.02b (RISC-OS/3.60) POPstar/2.02

In article <address@hidden>,
   Norbert Nemec <address@hidden> wrote:
> On Wed, Dec 13, 2000 at 01:37:37AM +0000, Dave Simons wrote:
> > Please, PLEASE, Sathermasters, integrate something into the STR class
> > to give this speed legally, cleanly, and portably!

> In the new library, string handling has been *completely* reworked. Strings
> are internationalized now, so there is not the one, simple, straightforward
> conversion to C-strings anymore. I do not know exactly what would be the
> method to convert a string that does correspond to your CSTR class. Problem
> is that, AFAIK, the bytewise representation of unicode strings is
> implementation dependent, so just passing a pointer to the memory space
> where the string is stored will never give you anything portable.

     OK! Let me give a brief explanation for those interested.  A string
has associated with it a 'culture and code' - I tend to say that as
'repertoire (of characters) and encoding' 'cos those are what is important -
as well as a way of specifying particular 'specials' as used in the general
library string conversion sorts of things.

     If you have a C (shudder, shudder!) string then, in general, it will
have been generated in the default culture for the execution environment. 
Once the full panoply of culture definitions is available and compilable
this is auto-magically determined by the program at run-time - with a hint
from a couple of environment variables.  The class STR has a routine called
create_from_external_string which does just that: given a reference (class
REFERENCE) to the external string, it converts it to a normal string in the
default repertoire and encoding - as I said above.
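
     As a rough illustration of the conversion just described, here is a
minimal sketch in Python (not Sather).  It assumes that the 'default
culture' can be approximated by the locale's preferred encoding; the
function only borrows the name of the STR routine and is hypothetical, not
the library's actual interface.

```python
import locale
from typing import Optional


def create_from_external_string(raw: bytes, encoding: Optional[str] = None) -> str:
    """Hypothetical stand-in for the STR routine of the same name."""
    # Assumption: the default culture is approximated by the locale's
    # preferred encoding, which environment variables can influence.
    if encoding is None:
        encoding = locale.getpreferredencoding(False)
    # A C string ends at its first NUL byte; ignore anything after it.
    text_bytes = raw.split(b"\x00", 1)[0]
    return text_bytes.decode(encoding)
```

     Passing an explicit encoding, as in
create_from_external_string(b"abc\x00", "ascii"), bypasses the environment
lookup, which is handy when the external string is known not to be in the
default culture.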

     The class REFERENCE permits all sorts of environment specified
'handles to external things' to be used in writing Sather.  The use of 'C'
classes is still, of course, possible, but this should normally be hidden
by the run-time system.  I will be honest and say that I have written many
thousands of lines of Sather without once using any of the 'C' classes.

     The only time that you really need to do anything special may well be
when interfacing to non-environment libraries or services.  I have never
had a real feel for that - it has just not come up in my use of the

Change of Subject
     In the course of some of the formal specification work which I am
trying to get to grips with I have come across a number of anomalies which
give me cause for concern -- anyone who feels able to comment
constructively on the following will be welcomed with open arms (or
something!!) -

     (1)  I am slowly coming to the conclusion that there is a need for a
class we may call ORD (short for ordinal) which has two operations - succ
and pred (or some such names) - together with a couple of predicates/values,
etc., is the only class to contain the times! iter (both variants) and
possibly an up! and upto! variant, and is the class used when indexing containers
of all sorts.  My reasoning comes from the general specification of all
iters, which technically need a history trace (in order to specify them)
for which the empty value is the only value on initial invocation - all
further history is a counted sequence.  This may seem an unnecessary
additional library class, but it will certainly simplify a lot of checking
code generation.  You will certainly realise that the current vogue for
indexing containers starting with zero is conventionally used where a
pointer to a memory location and an offset from that of zero is the first
element in the container.  Since Sather does not have pointers (well even
Pascal provided for arbitrary array indexing) there seems little reason to
perpetuate the use of zero at programmer level (how a compiler implementer
goes about twiddling bits and bytes to do the right thing is only of
concern to them).  Theoretically I think that it is the correct thing to do,
as it makes container indices what they are: ordinal numbers referring
to the first, second, etc. elements.  Introducing this will, however,
introduce the possibility of 'off by one' errors in existing code - only a
one off revision, I must hasten to stress - and then a significant saving
in code generated when indexing and limit checking.
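
     To make proposal (1) concrete, here is a minimal sketch in Python (not
Sather) of what such an ordinal class might look like.  The names Ord,
FIRST, upto and OrdList are hypothetical, chosen only for illustration.  In
this sketch an ordinal below 'first' cannot be constructed at all, so the
container only has to check the upper limit.

```python
from dataclasses import dataclass
from typing import Generic, Iterator, List, TypeVar

T = TypeVar("T")


@dataclass(frozen=True, order=True)
class Ord:
    """Hypothetical ordinal: 1 means 'first', 2 means 'second', and so on."""
    value: int

    def __post_init__(self) -> None:
        if self.value < 1:
            raise ValueError("ordinals start at the first element")

    def succ(self) -> "Ord":
        return Ord(self.value + 1)

    def pred(self) -> "Ord":
        # The first ordinal has no predecessor, so this raises ValueError.
        return Ord(self.value - 1)

    def upto(self, last: "Ord") -> Iterator["Ord"]:
        """Yield self, self.succ(), ... up to and including last."""
        current = self
        while current <= last:
            yield current
            current = current.succ()


FIRST = Ord(1)


class OrdList(Generic[T]):
    """Container indexed by ordinals rather than by zero-based offsets."""

    def __init__(self, items: List[T]) -> None:
        self._items = list(items)

    def indices(self) -> Iterator[Ord]:
        for n in range(1, len(self._items) + 1):
            yield Ord(n)

    def __getitem__(self, index: Ord) -> T:
        # Only the upper limit needs checking; Ord guarantees value >= 1.
        if index.value > len(self._items):
            raise IndexError("ordinal beyond the last element")
        # The zero-based offset is an implementation detail hidden here.
        return self._items[index.value - 1]
```

     With this, OrdList(["a", "b", "c"])[FIRST] is "a", and the translation
to a zero-based offset happens in exactly one place.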

     (2)  Concurrency is another area where we do not really have a clean
list of library classes - the set provided are mainly used in
compiler-generated 'call-back' to management functions.  This is, of
course, merely implementation.  We also seem to have semaphores and events
and locks and mutexes and ...  with all of which the same kinds of things
can be done.  I ask, therefore, for suggestions for a clean and simple set
of concurrency abstractions which readily match the language
lock/par/select constructs.
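
     The overlap between these primitives can be seen in a small sketch,
again in Python rather than Sather: a mutual-exclusion lock built from
nothing but a counting semaphore.  SemaphoreLock and count_with are
hypothetical names used only for this illustration; they are not part of
any Sather library.

```python
import threading


class SemaphoreLock:
    """Hypothetical lock built from a counting semaphore with one permit."""

    def __init__(self) -> None:
        self._sem = threading.Semaphore(1)

    def __enter__(self) -> "SemaphoreLock":
        self._sem.acquire()
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self._sem.release()
        return False  # never suppress exceptions


def count_with(lock, workers: int = 4, steps: int = 1000) -> int:
    """Increment a shared counter from several threads under the given lock."""
    total = 0

    def work() -> None:
        nonlocal total
        for _ in range(steps):
            with lock:
                total += 1

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total
```

     Both count_with(SemaphoreLock()) and count_with(threading.Lock())
protect the counter equally well, which is the sense in which one
well-chosen abstraction could stand in for several of the current ones.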

     (3)  My final concern is looking at the container abstractions,
particularly the apparent implementation bias of the abstract types
involved.  To me an abstraction is an abstraction is an abstraction and
cannot have anything to do with implementation.  I have noted over the
years of using Sather that 99 times out of a hundred I use the Fxxx classes
rather than the non-F varieties.  I would like to see the containers,
therefore, have a much closer relationship between the Fxxx and the non-F
variants such that they have the same kind of look and feel as strings
(whether text or binary): there is one variant with mutable semantics (the
F-variant), the other having immutable semantics.  I feel that the $Vxxx
abstraction variants are probably unnecessary.
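
     A minimal sketch of the 'same look and feel' idea, in Python rather
than Sather, might be two list classes with an identical interface whose
only difference is whether push changes the receiver.  ImmutableList,
MutableList and build are hypothetical names for illustration only; they
are not the existing Fxxx or non-F classes.

```python
from typing import Generic, Iterable, Iterator, TypeVar

T = TypeVar("T")


class ImmutableList(Generic[T]):
    """Immutable variant: push returns a new list, the receiver is unchanged."""

    def __init__(self, items: Iterable[T] = ()) -> None:
        self._items = tuple(items)

    def push(self, item: T) -> "ImmutableList[T]":
        return ImmutableList(self._items + (item,))

    def elements(self) -> Iterator[T]:
        return iter(self._items)


class MutableList(Generic[T]):
    """Mutable variant: push changes the receiver and returns it."""

    def __init__(self, items: Iterable[T] = ()) -> None:
        self._items = list(items)

    def push(self, item: T) -> "MutableList[T]":
        self._items.append(item)
        return self

    def elements(self) -> Iterator[T]:
        return iter(self._items)


def build(xs):
    """Client code written once; it works with either variant."""
    for n in (1, 2, 3):
        xs = xs.push(n)
    return xs
```

     Because callers always write xs = xs.push(n), the same client code
runs against either variant, and the choice between mutable and immutable
semantics is made only where the container is created.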

     I remain to be convinced to go in one direction or another on all of
these points.  I would really like to provide a clean and consistent group
of abstractions for each section of the library.  The three areas above are
the principal ones which may affect existing code.  For other sections
there is merely a change in abstraction specification for some items with
no or very little impact on code already written.

     This whole business of specification has opened my eyes to a lot of
difficulties in the OO world.  I do, however, want to get it right - as I
keep saying to my students - lean, mean, clean and green!

     I am currently tidying up various sections of the library as part of
doing the specification design.  While I could 'go public' with a partially
fixed version before Christmas, I would prefer to talk about the end of
February (I am away for six weeks in China just after Christmas).

     In signing off for further deep thought, may I wish everyone reading
this a very Merry Christmas and a Prosperous entry into the 21st century.

                Regards to all,

                        Keith Hopper

City Desk
Waikato University
[PGP key available if desired]
