bug-coreutils

size_t option parsing question


From: Eric Blake
Subject: size_t option parsing question
Date: Thu, 28 Sep 2006 15:43:23 +0000 (UTC)
User-agent: Loom/3.14 (http://gmane.org/)

I'm trying to make m4 -l<num> comply with POSIX guidelines for numeric argument 
parsing (for example, 'm4 -l a' should issue a complaint that 'a' is not 
numeric, although this is silently accepted in M4 1.4.7).  I decided to turn to 
coreutils for inspiration.

I noticed that various coreutils parse command line options as size_t values.  
For example, uniq.c provides a static size_opt(), used by -f and -s, among 
others.  But its current implementation raises some questions in my mind:

First, POSIX requires that options taking numeric arguments accept decimal
numbers at least as big as (1<<31)-1, and that they distinguish between two
errors: out of range (the argument parses as a number but is too big for the
use of the option) and syntactically invalid (the argument can't be parsed as
a number, whether because of non-numeric characters, or because it was all
digits but caused overflow).  But POSIX also states that size_t is only
guaranteed to be 16 bits.  Is coreutils safe in assuming that size_t is at
least 31 bits on all platforms that it targets?  If not, then size_opt()
violates POSIX: a numeric argument that fits in 31 bits but is bigger than
size_t can hold gets the same error message as an argument that is not a
number at all, and POSIX gives no explicit permission for -f and -s to have a
smaller-than-31-bit range.

Second, POSIX does not always guarantee that size_t fits in a long.  It states 
that for all platforms, at least one programming environment makes this 
guarantee, but does coreutils ensure that it is using this programming 
environment, or is there a case on 64-bit platforms where size_t is 64 bits but 
long is only 32?  If the latter, then size_opt() is buggy for using xstrtol 
instead of xstrtoumax.
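
One way to sidestep the long-versus-size_t question is to parse into
uintmax_t, which can hold any size_t value, and range-check afterwards.  A
minimal sketch using the standard strtoumax rather than gnulib's xstrtoumax;
the name parse_size is hypothetical:

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return true and set *RESULT if ARG is a decimal
   number that fits in size_t, whatever the width of long.  */
static bool
parse_size (char const *arg, size_t *result)
{
  assert (arg != NULL && result != NULL);

  /* Reject empty strings, signs and leading white space up front.  */
  if (arg[0] < '0' || arg[0] > '9')
    return false;

  errno = 0;
  char *end;
  uintmax_t value = strtoumax (arg, &end, 10);

  /* The comparison with SIZE_MAX is done in uintmax_t, so nothing is
     truncated before the check.  */
  if (*end != '\0' || errno == ERANGE || value > SIZE_MAX)
    return false;

  *result = (size_t) value;
  return true;
}
```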

Next, there is still a big FIXME in xstrtol debating whether the API should be 
changed slightly and all callers updated, so that clients of xstrtol can easily 
get suffix parsing by passing in NULL instead of a long list of letters.  The 
longer we wait to resolve this API, the more projects will be impacted by a 
change.  And would size_opt be useful in gnulib, alongside xstrtol, rather than 
buried in uniq.c?

Finally, it looks like suffix parsing is a purely upward compatible extension
to POSIX in many cases.  For example, 'uniq -s1M' seems like it would be a nice
extension as shorthand for 'uniq -s1048576' (xstrtol scales a bare 'M' by
1024*1024), rather than the current behavior of complaining of an invalid
number.

-- 
Eric Blake





