Re: large integer truncation in regex module
From: Steven M. Schweda
Subject: Re: large integer truncation in regex module
Date: Sat, 26 May 2012 15:24:56 -0500 (CDT)
Re: http://lists.gnu.org/archive/html/bug-gnulib/2012-03/msg00154.html
> [...] I think an ifdef may be used instead [...]
I agree. I, too, recently ran into this problem, in my case, on VMS.
On Alpha (and IA64, both 64-bit systems), the result is annoying
informational messages:
ALP $ cc /include = [] /noobject regex.c
dfa->word_char[0] = UINT64_C (0x03ff000000000000);
..............................^
%CC-I-INTCONSTTRUNC, In this statement, conversion of the constant "0X03FF000000
000000" to unsigned long type will cause data loss.
at line number 961 in file ALP$DKC0:[UTILITY.SOURCE.ZIP.regex]regcomp.c;1
dfa->word_char[1] = UINT64_C (0x07fffffe87fffffe);
..............................^
%CC-I-INTCONSTTRUNC, In this statement, conversion of the constant "0X07FFFFFE87
FFFFFE" to unsigned long type will cause data loss.
at line number 962 in file ALP$DKC0:[UTILITY.SOURCE.ZIP.regex]regcomp.c;1
On VAX (a 32-bit system), it's fatal. Even with a lame work-around:
#define UINT64_C(x) x##UL
to avoid warnings like the following, caused by trying to use "ULL" with
a compiler which doesn't understand "long long":
dfa->word_char[0] = UINT64_C (0x03ff000000000000);
..............................^
%CC-W-INVALTOKEN, Invalid token discarded.
At line number 961 in ALP$DKC0:[UTILITY.SOURCE.ZIP.REGEX]REGCOMP
.C;1.
one still gets these (fatal) complaints:
GIMP $ cc /include = [] /noobject regex.c
dfa->word_char[0] = UINT64_C (0x03ff000000000000);
..............................^
%CC-E-INTCONST, Ill-formed integer constant.
At line number 961 in ALP$DKC0:[UTILITY.SOURCE.ZIP.REGEX]REGCOMP
.C;1.
dfa->word_char[1] = UINT64_C (0x07fffffe87fffffe);
..............................^
%CC-E-INTCONST, Ill-formed integer constant.
At line number 962 in ALP$DKC0:[UTILITY.SOURCE.ZIP.REGEX]REGCOMP
.C;1.
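All of these are caused by waiting until run-time to do work which
should have been done by the C preprocessor at compile-time. With a
run-time "if", the compiler must still parse the 64-bit constants in the
branch which can never be executed on a 32-bit system.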
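For illustration, here is a minimal sketch of a preprocessor-time selection of this kind. It is not the actual gnulib code: the names bitset_word_t, WORD_BITS, WORD_CHAR_WORDS, init_word_char, is_word_char and check_word_char_table are hypothetical stand-ins, the width test on ULONG_MAX is an assumption about how the word size might be detected, and the self-check assumes an ASCII execution character set. The two 64-bit constants are the ones in the diagnostics above; the 32-bit values are the same bit pattern split into four words. Lines in a skipped "#if" group are not compiled, so a compiler without a 64-bit integer type has no 64-bit constant to convert.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the regex module's bitset word type. */
typedef unsigned long bitset_word_t;

/* Decide the word width in the preprocessor (assumed detection method). */
#if ULONG_MAX == 0xffffffffUL
# define WORD_BITS 32
#elif (ULONG_MAX >> 31 >> 31 >> 1) == 1
# define WORD_BITS 64
#else
# error "this sketch handles only 32-bit and 64-bit unsigned long"
#endif

/* Number of words needed to hold one bit per 7-bit ASCII code. */
#define WORD_CHAR_WORDS (128 / WORD_BITS)

/* Set the bits for the ASCII word characters [0-9A-Za-z_]. */
static void
init_word_char (bitset_word_t *w)
{
#if WORD_BITS == 64
  w[0] = 0x03ff000000000000UL;  /* codes 0..63: digits */
  w[1] = 0x07fffffe87fffffeUL;  /* codes 64..127: letters and '_' */
#else
  w[0] = 0x00000000UL;          /* codes 0..31: none */
  w[1] = 0x03ff0000UL;          /* codes 32..63: digits */
  w[2] = 0x87fffffeUL;          /* codes 64..95: A-Z and '_' */
  w[3] = 0x07fffffeUL;          /* codes 96..127: a-z */
#endif
}

/* Return 1 if the bit for code c (0..127) is set, else 0. */
static int
is_word_char (const bitset_word_t *w, unsigned int c)
{
  return (int) ((w[c / WORD_BITS] >> (c % WORD_BITS)) & 1UL);
}

/* Compare the table with [0-9A-Za-z_] for every 7-bit code (ASCII assumed). */
static void
check_word_char_table (void)
{
  bitset_word_t w[WORD_CHAR_WORDS];
  unsigned int c;

  init_word_char (w);
  for (c = 0; c < 128; c++)
    {
      int expect = (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z')
                   || (c >= 'a' && c <= 'z') || c == '_';
      assert (is_word_char (w, c) == expect);
    }
}
```

Calling check_word_char_table() once exercises whichever branch the preprocessor selected; the other branch never reaches the compiler proper.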
> Generally speaking we prefer 'if (xxx)' to '#if xxx' where
> either will do, because the former is easier to read and
> reason about.
Really? I must be getting old. Ever since I started programming
computers, the goal was always a program which worked correctly, and
which, subject to the usual engineering trade-offs, occupied minimal
storage, and ran maximally fast. When targeted at multiple system
types, portability was highly valued, too. Any (claimed) difference in
readability and reasonability between "if (xxx)" and "#if xxx" would not
be given priority over all other considerations, especially in a case
which hits the trifecta of bad code: bigger size, lower speed, and worse
portability.
Which is better, a (small but questionable) advantage in code
readability, or a hideous pile of lame work-arounds needed to
accommodate that "advantage"?
> If the only problem with 'if (xxx)' is a bogus
> warning by some random compiler then it's probably better to
> leave it alone (and get the compiler fixed....).
If "a bogus warning by some random compiler" actually means "a
legitimate informational/warning/error diagnostic from any compiler
which I don't use", then this might make some sense. But probably not.
No doubt, the advice to "get the compiler fixed" was well-meant, but it
would be difficult to apply to a closed-source product like DEC/Compaq C
on VMS VAX, which has probably not seen a fix since around
"16-JAN-2001", and probably never will see another (even to replace
"Compaq" with "HP").
I realize that "GNU" and "portable" are spelled differently for a
reason, and I realize that I'm old and stupid, but I can see no valid
reason to retain this perverse, inefficient, unportable code, when a fix
would be so easy.
"This program is distributed in the hope that it will be useful,
[...]." But not too useful to too many people, apparently. I'd sound
more appreciative of, and grateful for, this (would-be-)nice package if
some deliberate implementation decisions didn't make it so difficult to
use.
For the curious, initially, I was considering the use of the GNU
regex package in the Info-ZIP Zip and UnZip programs, but our
portability requirements seem to be a little more stringent than those
for GNU regex, so I've dropped the idea for now. Then I ran into the
same problems in the latest Wget kit, which now incorporates GNU regex
code, and which I still try to keep available on VMS. (Imagine my
surprise when a Web search for the fix for this problem found instead a
discussion wherein the obvious fix was rejected on esthetic grounds.)
For the record:
ALP $ cc /version
HP C V7.3-009 on OpenVMS Alpha V8.3
GIMP $ cc /version
Compaq C V6.4-005 on OpenVMS VAX V7.3
Thanks for your attention.
------------------------------------------------------------------------
Steven M. Schweda address@hidden
382 South Warwick Street (+1) 651-699-9818
Saint Paul MN 55105-2547