Re: numfmt (=print 'human' sizes) updates


From: Assaf Gordon
Subject: Re: numfmt (=print 'human' sizes) updates
Date: Wed, 26 Dec 2012 17:40:25 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.4) Gecko/20120510 Icedove/10.0.4

Hello,

Attached is an updated numfmt, with the following two changes:
1. "--format" support
2. optionally ignoring input errors.


Regarding output format:
I've implemented a limited subset of the functionality: it supports only format 
strings with the syntax "%['][-][N]f" (literal characters are of course allowed 
before and after the % directive). I did not implement the additional 
functionality you described in your previous email. I hope it still addresses 
most of the common use cases.

Some examples:
    $ ./src/numfmt --format --%f-- 5000
    --5000--
    $ ./src/numfmt --format --%f-- --to=iec 5000
    --4.9K--
    $ ./src/numfmt --format --%10f-- 5000
    --      5000--
    $ ./src/numfmt --format --%-10f-- 5000
    --5000      --
    $ ./src/numfmt --format --%10f-- --to=si 5000
    --      5.0K--
    $ ./src/numfmt --format --%-10f-- --to=si 5000
    --5.0K      --
    $ ./src/numfmt --format --%-10f-- --to=iec-i 5000
    --4.9Ki     --
    $ LC_ALL=fr_FR.utf8 ./src/numfmt --format "--%'10f--" 5000
    --     5 000--
    $ LC_ALL=en_US.utf8 ./src/numfmt --format "--%'-10f--" 5000
    --5,000     --
    $ ./src/numfmt --format --%10f-- --suffix B 5000
    --     5000B--
    $ ./src/numfmt --format --%-10f-- --suffix B 5000
    --5000B     --
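
The width and alignment in these examples behave like %Ns and %-Ns in the
shell's own printf. A minimal sketch, assuming only a POSIX printf: the helper
name pad_demo is hypothetical and not part of numfmt, and it pads a string
that is already formatted rather than converting a number.

```shell
# pad_demo WIDTH TEXT
# Hypothetical helper (not part of numfmt): pad TEXT to WIDTH columns,
# right-aligned for a positive WIDTH, left-aligned for a negative one,
# and wrap the result in "--" like the examples above.
pad_demo() {
    printf '%s' '--'
    printf "%${1}s" "$2"
    printf '%s\n' '--'
}

pad_demo 10 5000      # right-aligned in 10 columns
pad_demo -10 5.0K     # left-aligned in 10 columns
pad_demo 10 5000B     # the suffix counts toward the width
```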

The following two are now equivalent:
   ls -lh | ./src/numfmt --field 5 --header --from=iec --padding 10
   ls -lh | ./src/numfmt --field 5 --header --from=iec --format %10f

Any 'too-complicated' format will trigger an error:
   $ ./src/numfmt --format %4.5f 5000
   ./src/numfmt: invalid format '%4.5f', directive must be %['][-][N]f
   $ ./src/numfmt --format %+4g 5000
   ./src/numfmt: invalid format '%+4g', directive must be %['][-][N]f
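
The accepted grammar can be approximated with a regular expression. This is a
rough sketch, not the patch's actual parser: the function name is hypothetical,
and it assumes that exactly one % directive is required and that no other %
character may appear in the format (how a literal "%%" is treated is not
stated here).

```shell
# is_simple_format FORMAT
# Hypothetical check (not the patch's parser): succeed only if FORMAT is
# optional text, one directive of the form %['][-][N]f, then optional text.
# Assumption: no other '%' may appear anywhere in FORMAT.
is_simple_format() {
    printf '%s\n' "$1" | grep -Eq "^[^%]*%'?-?[0-9]*f[^%]*\$"
}

# Try a few formats taken from the examples above.
for fmt in "--%f--" "%-10f" "%'10f" "%4.5f" "%+4g"; do
    if is_simple_format "$fmt"; then
        echo "accepted: $fmt"
    else
        echo "rejected: $fmt"
    fi
done
```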
 



Regarding ignoring input errors:
I've restructured the code so that all parsing and conversion is done before 
any output is printed. 
Any errors or warnings are printed before any output, so output lines and 
error messages are not interleaved.

I'm not sure it's optimal to always ignore conversion errors (and print the 
input as-is).
It could lead to invalid and very confusing output for unsuspecting users.
Example:
   echo "5.0M" | numfmt --to=si   -> "5.0M"
This is incorrect, because the program should not accept suffixed numbers on 
the input (there's no "--from=si").
This is just one of many possible confusions.
Moreover, not all users will look at the STDERR messages: they may redirect 
output to a file and ignore what's printed on the screen, or try something like
  VAR=$(numfmt --to=si "$TEXT")
where "$TEXT" is not valid.
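
One way a script can guard against this is to test the exit status of the
command substitution before using the variable. A minimal sketch: the function
name to_si_checked and the NUMFMT override variable are hypothetical, and it
assumes only that the converter exits non-zero on invalid input.

```shell
# to_si_checked TEXT
# Hypothetical wrapper: print the converted value on success; on failure,
# print nothing on stdout, report on stderr, and return the converter's status.
# NUMFMT is an assumed override for the converter command (default: numfmt).
to_si_checked() {
    converter=${NUMFMT:-numfmt}
    if result=$("$converter" --to=si "$1"); then
        printf '%s\n' "$result"
    else
        status=$?
        echo "to_si_checked: could not convert '$1' (status $status)" >&2
        return "$status"
    fi
}

# Usage: only use VAR when the conversion succeeded.
# if VAR=$(to_si_checked "$TEXT"); then echo "size: $VAR"; fi
```

With this pattern the caller never sees the unconverted input on stdout, so a
failed conversion cannot be mistaken for a result.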


As a compromise, I added (yet another) option: "--ignore-errors".
By default, any input error stops the program, which exits with code 1.
If "--ignore-errors" is used, each invalid input line triggers a warning on 
stderr and is then printed unchanged; processing continues, and the program 
eventually exits with status 2.

This way we can assume that a user who knows to use "--ignore-errors" will also 
know to explicitly check exit codes and error messages. And a novice user will 
just get an immediate error and empty output.

Examples:
    ## Default - no STDOUT output if input is invalid
    $ ./src/numfmt --to=si 10M ; echo STATUS=$?
    ./src/numfmt: rejecting suffix in input: '10M' (consider using --from)
    STATUS=1
    
    ## ignore errors will pass input as-is
    $ ./src/numfmt --ignore-errors --to=si 10M ; echo STATUS=$?
    ./src/numfmt: rejecting suffix in input: '10M' (consider using --from)
    10M
    STATUS=1
    

    ## multi-lined input - stop at first error
    $ printf "10000\n2M\n30000" | ./src/numfmt --to=si ; echo STATUS=$?
    10K
    ./src/numfmt: rejecting suffix in input: '2M' (consider using --from)
    STATUS=1

    ## --ignore-errors will process all lines
    $ printf "10000\n2M\n30000" | ./src/numfmt --ignore-errors --to=si ; echo STATUS=$?
    10K
    ./src/numfmt: rejecting suffix in input: '2M' (consider using --from)
    2M
    30K
    STATUS=2
    

    ## multi-lined input - stop at first error
    $ printf "A 1M\nB\nC 3G\n" | ./src/numfmt --from=si --field 2 ; echo STATUS=$?
    A 1000000
    ./src/numfmt: input line is too short, no numbers found to convert in field 2
    STATUS=1
    
    ## --ignore-errors will process all lines
    $ printf "A 1M\nB\nC 3G\n" | ./src/numfmt --ignore-errors --from=si --field 2 ; echo STATUS=$?
    A 1000000
    ./src/numfmt: input line is too short, no numbers found to convert in field 2
    B
    C 3000000000
    STATUS=2
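
A calling script can branch on these exit statuses. This is a sketch only: the
helper name is hypothetical, and the meanings follow the prose description of
the two modes in this message for the patched numfmt, not any released
version.

```shell
# describe_status CODE
# Hypothetical helper mapping the exit statuses described in this message
# (for the patched numfmt) to a short explanation.
describe_status() {
    case "$1" in
        0) echo "ok: no input errors" ;;
        1) echo "error: invalid input, processing stopped (the default behaviour)" ;;
        2) echo "partial: invalid lines were passed through unchanged" ;;
        *) echo "unexpected status: $1" ;;
    esac
}

# Usage with the patched binary (--ignore-errors is the option proposed here):
# printf "10000\n2M\n30000" | ./src/numfmt --ignore-errors --to=si
# describe_status $?
```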

Of course, if there are no errors, exit code is 0 regardless of 
"--ignore-errors".


Comments are welcome,
 -gordon


Attachment: numfmt.10.patch.xz
Description: application/xz

