Re: [bug-gawk] mystrtonum for any awk (Was: Handling hexadecimals in dif
From: Jarno Suni
Subject: Re: [bug-gawk] mystrtonum for any awk (Was: Handling hexadecimals in different modes)
Date: Sun, 27 Sep 2015 13:25:14 +0300
On Sat, 26 Sep 2015 21:20:47 +0300
Aharon Robbins <address@hidden> wrote:
> > > Thanks for this code.
> > >
> > > Are you willing:
> > >
> > > 1. To sign paperwork putting this code into the public domain?
> >
> > How would signing paperwork happen?
>
> I would ask Karl Berry to send you the paperwork to print out and
> mail in.
That sounds complicated, but I will do it if it is necessary for the
code to be included.
> > Would the author be mentioned
> > anywhere in the manual?
>
> I would credit you on the page. :-)
Ok
> > I have to change the code a bit to make it return proper "+nan"+0 in
> > error cases. The test harness could be simplified. Or even removed,
> > what do you think?
>
> Just comment it out. I can include it in the file without it having
> to appear in the manual.
In which file?
> > > 2. To write a prose description of how it works for the
> > > manual?
> >
> > Well, it depends. The code and comments tell how it works. I could
> > add a few comments. I suppose I could do some kind of usage
> > instructions or manual for it, too. Is there something unclear in
> > how the converter works?
>
> I haven't read the code yet. But I'm looking for prose in the current
> style of the manual; the idea is to replace the existing code with
> yours.
I tuned the program a bit and added some comments; see below. I hope
that helps in understanding the program. You may add a further
description if you wish. Please review the program.
> > BTW Perl is going to change its handling of octal numbers with
> > version 6:
> > http://design.perl6.org/S02.html#Radix_markers
>
> Interesting, but not likely to influence anything I will do... :-)
Many other programming languages also recognize octal numbers by a 0o
prefix:
https://en.wikipedia.org/wiki/Octal#In_computers
Though 0c would be clearer IMO if upper case is allowed (0O vs.
0C). I used /^0[oO]/ in the program because awk traditionally allows
upper case, and at least the lower case o seems to be common practice
nowadays. (Perl does not support upper case in the prefix.)
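As a minimal sketch of that prefix rule (not part of the posted program; the shell wrapper and the name radix_of are illustrative assumptions only), the base implied by a prefix could be detected like this:

```sh
# Hypothetical helper: print the base implied by a "0b"/"0o"/"0d"/"0x"
# prefix (either letter case), or 10 when no such prefix is present.
radix_of() {
    awk -v s="$1" 'BEGIN {
        if      (s ~ /^0[bB]/) print 2
        else if (s ~ /^0[oO]/) print 8
        else if (s ~ /^0[dD]/) print 10
        else if (s ~ /^0[xX]/) print 16
        else                   print 10
    }'
}

radix_of 0o76   # prints 8
radix_of 0XFF   # prints 16
radix_of 070    # prints 10
```

Under this scheme a bare leading zero, as in 070, does not select octal; only an explicit letter prefix changes the base.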
#!/usr/bin/awk -f
# convert_from_base --- Generic function to convert string representing
# a natural number of given base to number. Start conversion from i'th
# character. Return the converted number, or nan, if the string is
# invalid for the given base.
function convert_from_base(base, str, i,    ret, n, p)
{
    n = length(str)
    if (i > n) return nan # expect at least one digit
    if ((ret=v[substr(str, i, 1)])!="" && ret<base) {
        while (i < n) {
            i++
            if ((p=v[substr(str, i, 1)])!="" && p<base)
                ret = ret*base + p
            else return nan
        }
        return ret
    } else return nan
}
# my_strtonum --- convert string to number using given base. If no base
# is given, detect the base from the prefix; if there is no prefix,
# expect a decimal number, which may also be a floating point number. In
# all other cases only a string representing a natural number is valid.
# Support prefixes "0b", "0o", "0d" and "0x" for binary, octal, decimal
# and hexadecimal numbers, respectively; case of the letter does not
# matter. Return the converted number, or if the base or the string is
# invalid, return the special value nan. The precision of awk's
# arithmetic limits how large a number can be converted accurately. If
# a decimal separator is used, expect the same character that the print
# command uses as the decimal separator.
function my_strtonum(str, base)
{
    if (base) {
        if (base < 2 || base > ld) {
            print "ERROR: base should be within [2," ld "]">"/dev/stderr"
            return nan
        }
        # expect natural number of given base
        return convert_from_base(base, str, 1)
    } else if (substr(str, 1, 1) == "0") {
        if (str ~ /^.[bB]/) {
            # expect natural binary
            return convert_from_base(2, str, 3)
        } else if (str ~ /^.[oO]/) {
            # expect natural octal
            return convert_from_base(8, str, 3)
        } else if (str ~ /^.[dD]/) {
            # expect natural decimal
            return convert_from_base(10, str, 3)
        } else if (str ~ /^.[xX]/) {
            # expect natural hexadecimal
            return convert_from_base(16, str, 3)
        }
    }
    if (str !~ rd) return nan
    # valid decimal, possibly floating point
    return str + 0
}
BEGIN {
    # Define some global constants:
    nan="+nan"+0 # marks "Not a Number"
    digits="0123456789abcdefghijklmnopqrstuvwxyz"
    ld=length(digits) # maximum base (36)
    # Create a lookup table for values of digits:
    for(i=0; i<length(digits); i++) v[substr(digits,i+1,1)]=i
    # Upper case digits are equal to lower case:
    for(i=10; i<length(digits); i++) v[toupper(substr(digits,i+1,1))]=i
    # d is the decimal separator, which may vary with the awk
    # implementation, command-line options and the locale in use.
    d=substr(sprintf("%g",1.1),2,1)
    # rd is a regular expression that matches a decimal floating point
    # number.
    rd="^[-+]?([0-9]+\\" d "?|\\" d "[0-9])[0-9]*([eE][-+]?[0-9]+)?$"
# test harness
#a[0]="-.1"
#a[1]="25"
#a[2]=".31"
#a[3]="0123"
#a[4]="0xdeadBEEF"
#a[5]="123.45"
#a[6]="1.e3"
#a[7]="1.32"
#a[8]="1.32E2"
#a[9]=".e2"
#a[10]="3.9e-2"
#a[11]="1e5"
#a[12]=""
#a[13]="1,123"
#a[14]="awk"
#a[15]="1 000.4"
#a[16]=".3e-2"
#a[17]="-"
#a[18]="."
#a[19]="+."
#a[20]="deadBEEF"
#a[21]="deadbeef"
#a[22]="oajlaselkjZ"
#a[23]="0xdead"
#a[24]=",1"
#a[25]="0,23"
#a[26]="3e-2"
#a[27]="070"
#a[28]="1.2a"
#a[29]="1,2a"
#a[30]="01e1"
#a[31]="ö"
#a[32]="0b101"
#a[33]="0o76"
#a[34]="0d96"
#a[35]="0Xf"
#for (i=0; i in a; i++) {
#printf "\"%s\": %g, \"%s\", %d, \"%s\", add 1: \"%s\"\n",
#a[i],
#my_strtonum(a[i]), my_strtonum(a[i]),
#my_strtonum(a[i],ld), my_strtonum(a[i],ld),
#my_strtonum(a[i],ld)+1
##print strtonum(a[i]), strtonum(a[i]"") # works only by gawk
#}
}
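The digit-by-digit conversion that convert_from_base performs can also be sketched on its own. This is an illustration under assumptions, not the posted code: it looks up digit values with index() instead of the lookup table v, prints the string "nan" rather than returning a NaN value, and the shell wrapper name from_base is hypothetical.

```sh
# Hypothetical helper: convert the natural number in string $2, written
# in base $1 (2..36), to decimal. Prints "nan" for an empty string or
# for a digit that is not valid in the given base.
from_base() {
    awk -v base="$1" -v str="$2" 'BEGIN {
        digits = "0123456789abcdefghijklmnopqrstuvwxyz"
        str = tolower(str)           # upper case digits equal lower case
        n = length(str)
        if (n == 0) { print "nan"; exit }
        ret = 0
        for (i = 1; i <= n; i++) {
            # index() is 1-based and returns 0 when the character is absent
            p = index(digits, substr(str, i, 1)) - 1
            if (p < 0 || p >= base) { print "nan"; exit }
            ret = ret * base + p     # shift left one digit, add the new one
        }
        print ret
    }'
}

from_base 16 ff    # prints 255
from_base 2 101    # prints 5
from_base 8 9      # prints nan
```

Each step multiplies the running value by the base and adds the next digit, so the string is read once from left to right.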
Regards,
Jarno
--
Jarno Ilari Suni - http://www.iki.fi/8/