Re: [bug-gawk] gawk printf misunderstand


From: Nelson H. F. Beebe
Subject: Re: [bug-gawk] gawk printf misunderstand
Date: Thu, 1 Nov 2012 10:29:21 -0600 (MDT)

Yesterday, in response to a posting, Arnold reported on this list that
the printf 0 flag handling in gawk had been changed to NOT provide
zero fill with a format "%04s".

In general, awk (and other scripting language) implementations of
printf() have usually followed the native C implementations.  

Here is what Section 7.19.6.1 of the ISO C99 Standard (Technical
Corrigendum 3) has to say about that flag:


    0         For d, i, o, u, x, X, a, A, e, E, f, F, g, and G 
              conversions, leading zeros (following any indication of
              sign or base) are used to pad to the field width rather
              than performing space padding, except when converting an
              infinity or NaN. If the 0 and - flags both appear, the 0
              flag is ignored. For d, i, o, u, x, and X conversions,
              if a precision is specified, the 0 flag is ignored. For
              other conversions, the behavior is undefined.

Notice in particular the last sentence.
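
The defined cases in that paragraph can be checked with a small C
sketch.  The helper name fmt_int is hypothetical (it is not part of any
library); it only forwards a caller-supplied format to snprintf(), so
that compilers do not warn at compile time about the deliberately
redundant flag combinations.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (an assumption for illustration): format one int
   with a caller-supplied format string and return the buffer.  The
   format is a parameter rather than a literal, so the compiler does not
   diagnose the redundant "0" flag at compile time. */
const char *fmt_int(char *buf, size_t size, const char *fmt, int value)
{
    snprintf(buf, size, fmt, value);
    return buf;
}
```

With a buffer such as char buf[32], fmt_int(buf, sizeof buf, "%04d", 7)
gives "0007" (zero fill), "%-04d" gives "7   " (the 0 flag is ignored
when - is present), and "%04.2d" gives "  07" (the 0 flag is ignored
when a precision is given).  All three follow from the quoted text and
involve no undefined behavior.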

The 1989 and 2011 ISO C Standards say essentially the same thing.

Because leading zero filling ensures that a numeric output field
remains numeric, it seems most sensible to ignore it for nonnumeric
output fields, which should then retain leading space filling instead.

Nevertheless, the "behavior is undefined" phrase suggests that
existing practice might be examined.  Here is what I found for the
statement printf("%04s\n", "ff"):

        all GNU/Linux:          output is "  ff" and some additionally warn
                                at compile time:
                                "'0' flag used with '%s' gnu_printf format"

        FreeBSD:                "00ff"
        Mac OS X:               "00ff"
        MirBSD:                 "00ff"
        NetBSD:                 "00ff"
        OpenBSD:                "00ff"
        Solaris:                "00ff"

        -lmcw                   "  ff"

The last entry is my MathCW library, which implements all of C99's -lm,
plus the printf/scanf families, and much more; on several platforms, it
also supports decimal floating-point arithmetic.
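
A survey like the one above can be repeated on any system with a sketch
along the following lines.  Both function names are hypothetical.
probe_string_format() simply reports what the local C library does with
a format such as "%04s"; because the Standard leaves that case
undefined, its result is platform-specific.  pad_left() is one possible
portable alternative for code that needs a particular fill character:
it uses only defined behavior.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical probe: format string s with a caller-supplied format
   (for example "%04s").  The 0 flag with %s is undefined in ISO C, so
   the result depends on the C library in use. */
const char *probe_string_format(char *buf, size_t size,
                                const char *fmt, const char *s)
{
    snprintf(buf, size, fmt, s);
    return buf;
}

/* Hypothetical portable alternative: right-justify s in a field of at
   least `width` characters, filling on the left with `fill`.  Output is
   truncated to fit in buf and is always NUL-terminated when size > 0. */
const char *pad_left(char *buf, size_t size, const char *s,
                     size_t width, char fill)
{
    size_t len = strlen(s);
    size_t pad = (len < width) ? width - len : 0;
    size_t i = 0;

    if (size == 0)
        return buf;
    while (i < pad && i + 1 < size)     /* leading fill characters */
        buf[i++] = fill;
    while (*s != '\0' && i + 1 < size)  /* then the string itself */
        buf[i++] = *s++;
    buf[i] = '\0';
    return buf;
}
```

Here pad_left(buf, sizeof buf, "ff", 4, '0') yields "00ff" and
pad_left(buf, sizeof buf, "ff", 4, ' ') yields "  ff" on every
platform, whereas the result of probe_string_format(buf, sizeof buf,
"%04s", "ff") is whatever the local library chooses.  Note that passing
the literal "%04s" directly to printf() is what triggers the
compile-time warning quoted above on some systems.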

It MIGHT be worth changing the gawk user manual to note that gawk's
handling of printf() format items is intended to follow the behavior
mandated by the ISO C Standards, and when those standards say
"undefined behavior", to then fall back to the behavior of the glibc
implementations.  That would seem to be better than having gawk try
to supply its own implementation choices for "undefined behavior".

-------------------------------------------------------------------------------
- Nelson H. F. Beebe                    Tel: +1 801 581 5254                  -
- University of Utah                    FAX: +1 801 581 4148                  -
- Department of Mathematics, 110 LCB    Internet e-mail: address@hidden  -
- 155 S 1400 E RM 233                       address@hidden  address@hidden -
- Salt Lake City, UT 84112-0090, USA    URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------


