Re: [Bug-gnulib] addition: c-ctype.h, c-ctype.c
From: Bruno Haible
Subject: Re: [Bug-gnulib] addition: c-ctype.h, c-ctype.c
Date: Tue, 28 Jan 2003 21:57:59 +0100 (CET)
Paul Eggert writes:
> True, but my question is what the symbol C_CTYPE_ASCII means. That
> is, I am trying to understand the implementation, not trying to
> understand the API.
It means: The character set is ASCII or one of its variants or
extensions, not EBCDIC. I've corrected the comment now.
> From your remarks, apparently you mean for C_CTYPE_ASCII to mean "the
> character set is upward compatible with JIS X 0201:1997 left half
> (Japanese JIS Roman)".
Sorry, I must have expressed myself badly. If the character set has
'\\' but at a different code point than in ASCII, then the ASCII
optimizations should _not_ apply.
> Conversely, the #if doesn't test for '$' or '@', even though those two
> characters are in JIS Roman and your remarks suggest that you intended
> to test for '$' and '@'.
My earlier remarks were wrong. '$' and '@' are not tested, precisely
because these characters are not part of the "basic character
set".
ISO-646-CN is probably not a problem; it can be handled like ASCII.
(Whether C_CTYPE_ASCII gets set to 1 on a system with ISO-646-CN or
ISO-646-JP will depend on the source code conversions that have been
performed on the source file before compilation, for example whether
backslash was converted to YEN SIGN or not. Either way, it's not a
problem for the c_* functions.)
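
As a rough illustration of the kind of compile-time test being discussed, here is a minimal sketch. It is not the actual #if from c-ctype.h: the names SKETCH_ASCII and sketch_isupper are hypothetical, and only a handful of code points are checked. It does mirror two points made above: backslash is tested against its ASCII code point, and '$' and '@' are deliberately left out.

```c
/* Hypothetical, abbreviated stand-in for a C_CTYPE_ASCII-style test.
   Only a few members of the basic character set are checked here;
   '$' and '@' are intentionally not tested.  */
#if (' ' == 32) && ('0' == 48) && ('9' == 57) \
    && ('A' == 65) && ('Z' == 90) && ('\\' == 92) \
    && ('a' == 97) && ('z' == 122)
# define SKETCH_ASCII 1
#else
# define SKETCH_ASCII 0
#endif

/* Locale-independent uppercase test (hypothetical name).  */
static int
sketch_isupper (int c)
{
#if SKETCH_ASCII
  /* ASCII and its compatible variants: 'A'..'Z' are contiguous.  */
  return c >= 'A' && c <= 'Z';
#else
  /* Fallback that makes no assumption about code point order.  */
  switch (c)
    {
    case 'A': case 'B': case 'C': case 'D': case 'E': case 'F': case 'G':
    case 'H': case 'I': case 'J': case 'K': case 'L': case 'M': case 'N':
    case 'O': case 'P': case 'Q': case 'R': case 'S': case 'T': case 'U':
    case 'V': case 'W': case 'X': case 'Y': case 'Z':
      return 1;
    default:
      return 0;
    }
#endif
}
```

Both branches give the same answers; the #if only selects the cheaper range comparison when the code points are known to match ASCII.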
> Would you be convinced by an efficiency argument?
> On my host (GCC 2.95.3 with -O2, sparc), the unportable code:
>
> int f (int x) { return (x & ~0x7f) == 0; }
>
> requires 4 instructions, but the portable code:
>
> int g (unsigned x) { return x <= 0x7f; }
>
> requires only 3.
OK, why not. On x86 too, the generated code for

  int g (int x) { return x >= 0 && x <= 0x7f; }

is smaller.
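
For reference, the three variants from this exchange can be put side by side. The function names below are hypothetical; the bodies are the ones quoted above. This sketch only shows that they agree on a two's complement host; it says nothing about instruction counts, which depend on compiler and target.

```c
/* Mask test: relies on the two's complement representation of
   negative values.  */
static int
is_ascii_mask (int x)
{
  return (x & ~0x7f) == 0;
}

/* Unsigned comparison: a negative argument is converted to a large
   unsigned value by the call, so it fails the test.  */
static int
is_ascii_unsigned (unsigned x)
{
  return x <= 0x7f;
}

/* Explicit range test on a signed int.  */
static int
is_ascii_range (int x)
{
  return x >= 0 && x <= 0x7f;
}
```

The second and third forms are defined by value rather than by bit pattern, so they do not depend on how negative integers are represented.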
> Besides, a few ones-complement hosts with C compilers are still in use
> (Unisys mainframes)
Let's hope that they get out of business soon :-) (They would already
have, if they didn't succeed in extorting money from people who
believe in patent threats.)
> Anyway, if it's easy, it's better to avoid code that assumes two's
> complement, since such code is a bit trickier to read
On the contrary, such code is good teaching material for bit
operations. Did you know that for every x, the following holds?

  ((x - 1) & (- x - 1)) + 1 == (x & -x)
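
A small sketch for checking the identity. Two assumptions are mine, not the mail's: the function names are hypothetical, and the arithmetic is done on unsigned values, where wraparound is well defined, so that x == 0 and the most negative value cause no signed overflow. Note that in C the right-hand side needs its own parentheses, because == binds more tightly than &.

```c
/* x & -x isolates the lowest set bit of x (0 if x is 0).  */
static unsigned
lowest_bit (unsigned x)
{
  return x & -x;
}

/* In unsigned arithmetic, -x - 1 equals ~x.  So (x - 1) & (-x - 1)
   keeps exactly the bits below the lowest set bit of x, all set to 1;
   adding 1 carries into the position of that lowest set bit.  */
static unsigned
lowest_bit_alt (unsigned x)
{
  return ((x - 1) & (-x - 1)) + 1;
}
```

For example, with x = 12 (binary 1100): x - 1 is 1011, -x - 1 ends in 0011, their AND is 0011, and adding 1 gives 0100, the same as x & -x.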
> > For debugging it is best to use -O0, and in this case "c-ctype.h"
> > will use the external functions, not the macros.
>
> But that's two copies of the code, which have to be maintained
> separately. With inline functions you have one less copy of the code,
> so it should be less error-prone.
In general, I agree. In this case here, the functions won't change in
10 years.
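
To make the trade-off concrete, here is a minimal sketch of both layouts. All names are hypothetical, and the exact condition c-ctype.h uses to enable its macros is not shown in this thread; the test on GCC's predefined __OPTIMIZE__ is my assumption about how "external functions at -O0, macros otherwise" could be arranged. The inline variant assumes C99 or GNU C.

```c
/* Two-copy layout: an external function ...  */
int
sketch_isdigit (int c)
{
  return c >= '0' && c <= '9';
}

/* ... plus a macro of the same name, enabled only when optimizing.
   The macro body duplicates the function body and evaluates its
   argument twice.  */
#if defined __GNUC__ && defined __OPTIMIZE__
# define sketch_isdigit(c) ((c) >= '0' && (c) <= '9')
#endif

/* One-copy layout: a single inline definition serves both the
   debugging and the optimized build.  */
static inline int
sketch_isdigit_inline (int c)
{
  return c >= '0' && c <= '9';
}
```

Writing (sketch_isdigit) (c), with the name in parentheses, always calls the external function even when the macro is defined, which is handy when stepping through the code in a debugger.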
Bruno