Re: [gawk-stable] bug: fatal error when getline from directory
From: Paolo
Subject: Re: [gawk-stable] bug: fatal error when getline from directory
Date: Sun, 4 Jan 2009 11:11:01 +0100
User-agent: Mutt/1.3.28i
On Sat, Jan 03, 2009 at 10:14:25PM -0700, Eric Blake wrote:
> > Different awks do different things when handed a directory. Brian
> > Kernighan's awk treats a read of a directory as EOF; I think that's wrong.
>
> POSIX 2008 states that awk is only required to operate on text files, and
> a directory does not qualify as a text file:
> http://www.opengroup.org/onlinepubs/9699919799/utilities/awk.html
>
> Therefore, you can define gawk to do whatever you would like, as a
> compliant client will never ask gawk to read a directory.
I disagree: the definition of 'text file' is rather broad and vague [1], and
gawk actually seems to conform to IEEE Std 1003.1-2001 in that a 'line of
characters' is any sequence of characters up to the line separator, which is
'\n' by default but can be any character, '\0' included. For example:
$ echo -e -n '1\n1\x002\x003\x004\x00'| gawk 'BEGIN{RS="\0"}{ print "-"$0"-"}'
-1
1-
-2-
-3-
-4-
Besides, awk also operates on stdin, whose file type is unspecified, and gawk's
getline can open '/inet/' special files, which are generally 'binary' files.
The example above also shows that Aharon's point about filenames containing
'\n' isn't specific to an embedded readdir(3) 'mode': you may have to deal
with that issue in any case. For example:
$ a=`echo -e '/tmp/1\n1'`
$ touch "$a"
$ ls -1 /tmp
1?1
...
$ gawk 'BEGIN{while ("ls /tmp"|getline) print "-"$0"-"}'
-1-
-1-
...
I think you *cannot* define gawk to do whatever you like when getline is
given a directory, for the sake of predictability and consistency: if we don't
allow a directory to be treated like a line-oriented file, then getline should
return -1, as for any other error condition, since the script author might
need to catch and handle that case *from within* the script (I, for one, do
expect this).
On second thought, though, I'd rather have getline always return -1 on a
directory, and eventually implement readdir(3) as a special mode, e.g. '/dir/',
the same as for the other special files, so that if you expect 'a' to be a
directory you'd say 'getline f < "/dir/tmp/a"', which of course would return
-1 if 'a' is not a directory, and set ERRNO accordingly.
--
paolo
[1] http://www.opengroup.org/onlinepubs/009695399/
" 3.392 Text File
A file that contains characters organized into one or more lines. The lines do
not contain NUL characters and none can exceed {LINE_MAX} bytes in length,
including the <newline>. Although IEEE Std 1003.1-2001 does not distinguish
between text files and binary files (see the ISO C standard), many utilities
only produce predictable or meaningful output when operating on text files. The
standard utilities that have such restrictions always specify "text files" in
their STDIN or INPUT FILES sections."
>
> I recently changed GNU m4 to outright reject all attempts to open
> directories as input files, printing an error message and exiting with
> non-zero status after processing all other input files. I just don't see
> any other choice that is both sane and portable in how to handle directory
> reads in any way that you could easily document, considering that systems
> vary in whether you are even allowed to read from a file descriptor
> visiting a directory.
>
> http://lists.gnu.org/archive/html/m4-patches/2008-09/msg00004.html
> http://git.savannah.gnu.org/gitweb/?p=m4.git;a=commitdiff;h=4a5040d
>
> - --
> Don't work too hard, make some time for fun as well!
>
> Eric Blake address@hidden
--
paolo
GPG/PGP id:0x3A47DE45 - B5F9 AAA0 44BD 2B63 81E0 971F C6C0 0B87 3A47 DE45
- 9/11: the outrageous deception & coverup: http://journalof911studies.com -