bug-gawk

[bug-gawk] suggestions


From: Graham Ewart
Subject: [bug-gawk] suggestions
Date: Mon, 9 Jan 2012 09:46:15 -0500

First, comments: I would like to see the sequence "\#" treated differently.  Instead of the current rule, "As soon as awk sees the ‘#’ that starts a comment, it ignores everything on the rest of the line", I would like the rule to be "As soon as awk sees the ‘#’ that starts a comment, it ignores the # and everything on the rest of the line", so that a "\" immediately before the comment starter would be treated as a line continuation.  This would make it possible to put comments in the middle of statements.  Today the sequence "\#" gives the error "backslash not last character on line", so the change would not break working code.
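
For context, awk already accepts a comment in the middle of a statement at the places where a newline is allowed anyway, for example after a comma or after "&&". The sketch below only illustrates that existing behaviour; the input and the "sum:" label are made up for the example. The suggestion above would extend the same convenience to any position in a statement.

```shell
# Illustrative input; the field values are arbitrary.
out=$(echo "3 4" | awk '{
   if ($1 > 0 &&              # comment after "&&": the condition continues below
       $2 > 0)
      print "sum:",           # comment after a comma: the print list continues
            $1 + $2
}')
echo "$out"
```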

Second, and this is the real reason I'm writing: I'd like to see consecutive constant regular expressions concatenated.  So instead of writing:

gawk '// {
   match ($0,/[[:space:]]*(\([[:digit:]]+\))[[:space:]]*([^-[:space:]]+)[[:space:]]*-[[:space:]]*([^[:space:]]+)[[:space:]]*/,result)
   print "Number: " result[1], "Prefix: " result[2], "Suffix "result[3]
}'

I could write:

gawk '// {
   match ($0,/[[:space:]]*/             \# optional leading spaces
             /(\([[:digit:]]+\))/       \# digits in parens
             /[[:space:]]*/             \# blah blah blah...
             /([^-[:space:]]+)/  /[[:space:]]*/  /-[[:space:]]*/  /([^[:space:]]+)/  /[[:space:]]*/,result)
   print "Number: " result[1],"Prefix: " result[2],"Suffix "result[3]
}'
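
For comparison, a similar one-piece-per-line layout can be approximated in current gawk by building the pattern as a string and passing it to match() as a dynamic regexp. This is only a sketch of that approach, not the syntax proposed above: it assumes gawk (the three-argument match() is a gawk extension), the variable name re is made up, and every backslash has to be doubled because the pieces are strings rather than regexp constants.

```shell
out=$(echo "(123) xxx - yyy" | gawk '
BEGIN {
   re =    "[[:space:]]*"             # optional leading spaces
   re = re "(\\([[:digit:]]+\\))"     # digits in parens; backslashes doubled inside a string
   re = re "[[:space:]]*"
   re = re "([^-[:space:]]+)"         # prefix: anything but "-" or whitespace
   re = re "[[:space:]]*-[[:space:]]*"
   re = re "([^[:space:]]+)"          # suffix
   re = re "[[:space:]]*"
}
{
   if (match($0, re, result))
      print "Number: " result[1], "Prefix: " result[2], "Suffix " result[3]
}')
echo "$out"
```

The cost is the doubled backslashes and a regexp that is compiled from a string at run time, which is part of what concatenating regexp constants would avoid.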

This is what I do today inside a script:

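x() {
   # Join the arguments into one string, skipping bare newlines and "#" comments.
   local p
   for p in "$@" ; do
      if [[ $p == $'\n' || ${p:0:1} == "#" ]]; then continue; fi
      printf '%s' "$p"     # quoted, so pieces like [[:space:]]* are never glob-expanded
   done
}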
...
...
echo "(123) xxx - yyy" |
gawk '// {
   match ($0,/'$(x                  '
'          '[[:space:]]*'           '# optional leading spaces
'          '(\([[:digit:]]+\))'     '# digits in parens
'          '[[:space:]]*'           '# blah blah blah...
'          '([^-[:space:]]+)'       '
'          '[[:space:]]*'           '
'          '-'                      '
'          '[[:space:]]*'           '
'          '([^[:space:]]+)'        '
'          '[[:space:]]*'           '
'            )'/, result)
   print "Number: " result[1],            # show match result
         "Prefix: " result[2],
         "Suffix "result[3]
}'
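
A possibly simpler variant of the same trick, sketched here under the assumption that the script runs in bash: a bash array literal allows a comment after each element, so the pieces can be listed in an array and joined with printf, without a helper function or the quoted newlines. The names parts and re are made up for this example, and the pattern must not contain a "/" because it is pasted between the regexp delimiters.

```shell
# Each array element is one piece of the pattern; bash allows a comment after it.
parts=(
   '[[:space:]]*'           # optional leading spaces
   '(\([[:digit:]]+\))'     # digits in parens
   '[[:space:]]*'
   '([^-[:space:]]+)'       # prefix: anything but "-" or whitespace
   '[[:space:]]*'
   '-'
   '[[:space:]]*'
   '([^[:space:]]+)'        # suffix
   '[[:space:]]*'
)
re=$(printf '%s' "${parts[@]}")   # join the pieces with nothing in between

echo "(123) xxx - yyy" |
gawk '// {
   match ($0,/'"$re"'/, result)
   print "Number: " result[1], "Prefix: " result[2], "Suffix "result[3]
}'
```

Quoting "$re" where it is spliced into the program keeps the shell from word-splitting or glob-expanding the pattern.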

An alternative to my first suggestion, but only for this particular case, would be to ignore newlines following constant regular expressions.  I'm not proficient enough in awk, though, to know whether this change would have other (negative) effects.

Thanks for listening,


Graham Ewart




