bison-patches
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] Miscellaneous code readability improvements.


From: Eric Blake
Subject: Re: [PATCH] Miscellaneous code readability improvements.
Date: Thu, 27 Aug 2009 18:17:37 +0000 (UTC)
User-agent: Loom/3.14 (http://gmane.org/)

Joel E. Denny <jdenny <at> clemson.edu> writes:

> > > +    if (isspace (*p) && isprint (*p))
> > 
> Thanks for catching the problem.  However, I'm a little confused.  My 
> isspace man page says the argument is of type int.  It also says:
> 
>   These  functions  check  whether  c,  which  must  have the value of an
>   unsigned char or EOF, falls into a certain character class according to
>   the current locale.
> 
> Isn't EOF usually -1?  What type includes -1 and doesn't include signed 
> char?

These are okay:

int c = getchar(); // c will be EOF or an unsigned char value
isspace(c);        // c is already an in-range int

char *s;
unsigned char c = *s; // safe, since *s can't be EOF
isspace(c); // c is promoted to int, but promotion of unsigned char is safe

But these are not:

isspace(-2); // out of range, so result is unspecified

char *s;     // if char is signed, and *s has 8th bit set,
isspace(*s); // then *s promotes to int with sign extension, so you
             // could be calling isspace(-2)
isspace((int)*s); // making the cast explicit doesn't change things

True, most libc implementations support isspace(-2) as a synonym for isspace
(254), which handles \xfe whether or not you remembered to go through an 
unsigned char value.  But the REAL kicker is that in some locales:

isspace(0377) != isspace('\377')

Why?  '\377' can promote to -1 (rather than 0xff) if char is signed, EOF is 
typically -1 (although that is not required by POSIX), and there are locales 
where byte 255 is indeed a space.

Hence, calling isfoo((char)c) is asking for problems if c ever has the eighth 
bit set (and particularly if it is 255), but calling isfoo((unsigned char)c) is 
safe if you know that c is not EOF.

The gnulib module c-ctype introduces macros like c_isspace, which:
a) always give answers from the C locale, regardless of the current locale
b) explicitly documents that c_isspace(-2) is well-defined.

But for this particular context, I think you really do want to query whether a 
character is printable in the user's current locale.

> > * src/scan-gram.l (splice): Avoid compiler warning.
> 
> I would say the correct top-level construct is 
> <SC_ESCAPED_STRING,SC_ESCAPED_CHARACTER> rather than splice.  git doesn't 
> know how to read Flex specifications.

OK; I'll adjust that before pushing.

-- 
Eric Blake







reply via email to

[Prev in Thread] Current Thread [Next in Thread]