First public beta of upcoming gawk 4.0 release

From: Aharon Robbins
Subject: First public beta of upcoming gawk 4.0 release
Date: Fri, 27 May 2011 10:14:22 +0300

This note is to announce the first BETA release of GNU Awk 4.0.

It is available from:

This release represents over a year and a half of very hard work by a
number of people, and introduces a significant number of important new
features, as well as some more minor improvements.  The NEWS file is
appended below.

As far as I can tell, the documentation and code have both hit the
freeze point.

So, why do a beta release? So that you, yes you, the end user, can see
if anything I've done breaks gawk for you.  Then you can TELL ME ABOUT
IT so that I can fix it for the final release.


Arnold Robbins

   Copyright (C) 2010, 2011 Free Software Foundation, Inc.
   Copying and distribution of this file, with or without modification,
   are permitted in any medium without royalty provided the copyright
   notice and this notice are preserved.

Changes from 3.1.8 to 4.0.0

1. The special files /dev/pid, /dev/ppid, /dev/pgrpid and /dev/user are
   now completely gone. Use PROCINFO instead.

2. The POSIX 2008 behavior for `sub' and `gsub' is now the default.

3. The \s and \S escape sequences are now recognized in regular expressions.

4. The split() function accepts an optional fourth argument which is an array
   to hold the values of the separators.

5. New -b / --characters-as-bytes option that means "hands off my data"; gawk
   won't try to treat input as a multibyte string.

6. New --sandbox option; see the doc.

7. Indirect function calls are now available.

8. Interval expressions are now part of default regular expressions for
   GNU Awk syntax.

9. --gen-po is now correctly named --gen-pot.

10. switch / case is now enabled by default. There's no longer a need
    for a configure-time option.

11. Gawk now supports BEGINFILE and ENDFILE. See the doc for details.

12. Directories named on the command line now produce a warning, not
    a fatal error, unless --posix or --traditional is in effect.

13. The new FPAT variable allows you to specify a regexp that matches
    the fields, instead of matching the field separator. The new patsplit()
    function gives the same capability for splitting.

14. All long options now have short options, for use in `#!' scripts.

15. Support for IPv6 added via /inet6/... special file. /inet4/... forces
    IPv4 and /inet chooses the system default (probably IPv4).

16. Added a warning for /[:space:]/ that should be /[[:space:]]/.

17. Merged with John Haque's byte code internals. Adds dgawk debugger and
    possibly improved performance.

18. `break' and `continue' are no longer valid outside a loop, even with

19. POSIX character classes work with --traditional (BWK awk supports them).

20. Nuked redundant --compat, --copyleft, and --usage long options.

21. Arrays of arrays added. See the doc.

22. Per the GNU Coding Standards, dynamic extensions must now define
    a global symbol indicating that they are GPL-compatible. See
    the documentation and example extensions.

23. In POSIX mode, string comparisons use strcoll/wcscoll.

24. The option for raw sockets was removed, since it was never implemented.

25. If not in POSIX mode, gawk turns ranges of the form [d-h] into
    [defgh] before compiling a regexp.  Maybe this will stop all the
    questions about [a-z] matching uppercase letters.

26. PROCINFO["strftime"] now holds the default format for strftime().

27. Updated to latest infrastructure: Autoconf 2.68, Automake 1.11.1,
    Gettext 0.18.1, Bison 2.5.

28. Many code cleanups. Removed code for many old, unsupported systems:
        - Atari
        - Amiga
        - BeOS
        - Cray
        - MIPS RiscOS
        - MS-DOS with Microsoft Compiler
        - MS-Windows with Microsoft Compiler
        - NeXT
        - SunOS 3.x, Sun 386 (Road Runner)
        - Tandem (non-POSIX)
        - Prestandard VAX C compiler for VAX/VMS
        - Probably others that I've forgotten

29. If PROCINFO["sorted_in"] exists, for(iggy in foo) loops sort the
    indices before looping over them.  The value of this element
    provides control over how the indices are sorted before the loop
    traversal starts. See the manual.

30. A new isarray() function exists to distinguish if an item is an array
    or not, to make it possible to traverse multidimensional arrays.

31. asort() and asorti() take a third argument specifying how to sort.
    See the doc.
