bug-gnu-utils
Re: [gawk-stable] bug: fatal error when getline from directory


From: Paolo
Subject: Re: [gawk-stable] bug: fatal error when getline from directory
Date: Mon, 5 Jan 2009 00:25:37 +0100
User-agent: Mutt/1.3.28i

On Sun, Jan 04, 2009 at 07:58:13AM -0700, Eric Blake wrote:
...
> That definition carries over into POSIX 2008, pretty much unchanged:

same as the text I quoted @bottom in my msg

> But there is no ambiguity - it excludes directories, in part because

POSIX defines behaviour for text files as defined above (which, btw, also
covers any binary file, provided it contains no '\0'), but leaves it
undefined otherwise.
Eg, /dev/random isn't such a text file, yet POSIX awk can open and read it;
likewise with eg /dev/hda - is it anything like a 'text file'?
nawk (seems to) treat a directory like a text file and gets EOF - ie it
sees an empty file, as you'd get with a straight open(2), read(2):

#--[rd.c: cc rd.c -o rd]----
...
 fd=open("/tmp",O_RDONLY);
 printf("fd=%d\n",fd);
 if (fd >= 0) {                 /* 0 is a valid descriptor too */
   while (read(fd,&c,1) > 0)    /* stops on EOF (0) or on error (-1) */
     printf("%c",c);
   close(fd);
 } else
   perror("/tmp");
 printf("EOF\n");
...
$ rd
fd=3
EOF

which looks like a lazy - er, elegant - way of dealing with such a special
case.
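
Note that the loop in rd.c ends both when read(2) returns 0 (a real EOF) and when it returns -1 (an error), so its output alone cannot tell the two apart. A minimal sketch that keeps them separate follows; `probe_read` is a hypothetical helper name, not part of rd.c, and what a directory yields depends on the system (some return an error such as EISDIR, others may not).

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper, not part of rd.c: try to read one byte from path.
 * Returns 1 if a byte was read, 0 on a real end-of-file, -1 on failure.
 * On failure *err holds the errno value left by open(2) or read(2). */
int probe_read(const char *path, int *err)
{
    char c;
    ssize_t n;
    int fd;

    *err = 0;
    fd = open(path, O_RDONLY);
    if (fd < 0) {               /* 0 is a valid descriptor, so test < 0 */
        *err = errno;
        return -1;
    }
    n = read(fd, &c, 1);
    if (n < 0)
        *err = errno;           /* save before close(2) can overwrite it */
    close(fd);
    return n < 0 ? -1 : (int)n;
}
```

Calling `probe_read("/tmp", &err)` and printing `strerror(err)` when the result is -1 shows whether the local system treats a directory as an empty file or as a read error.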

> > $ echo -e -n '1\n1\x002\x003\x004\x00'| gawk 'BEGIN{RS="\0"}{ print 
> > "-"$0"-"}'
> 
> The fact that gawk supports this is an extension; it is not required by

and I like it - btw it's the same behaviour with --compat and --posix.

> POSIX.  You did not give gawk a text file in this example, but gawk went

so? should it have complained with "fatal: '-' is not a text file"?

> ahead and used the final unterminated line as though a newline had been
> present in the original.  (By the way, printf is more portable than echo).

I don't get your point: it correctly read in the whole file and chopped
it into records using the specified RS. Plain awk only sees the 1st 'record',
ie '1\n1', discarding everything beyond the 1st '\0'.

> 
> > *awk operates on stdin as well, whose type is undefined
> 
> Have you ever heard of fstat?  It is easy to determine if stdin is a

can't swear, but guess I did once ...

> directory, in which case an error can be produced.

sorry, bad wording, was thinking of 'text' vs 'binary' stream
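
For reference, the fstat check mentioned in the quoted text could look roughly like this. It is only a sketch: `fd_is_directory` and `path_is_directory` are made-up names, and nothing here is taken from any awk implementation.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if fd refers to a directory, 0 if it does not,
 * -1 if fstat(2) itself fails (eg bad descriptor). */
int fd_is_directory(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return -1;
    return S_ISDIR(st.st_mode) ? 1 : 0;
}

/* Same check for a named file: open it first, then fstat the descriptor.
 * Returns -1 if the file cannot be opened. */
int path_is_directory(const char *path)
{
    int fd, r;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    r = fd_is_directory(fd);
    close(fd);
    return r;
}
```

For stdin the call would be `fd_is_directory(STDIN_FILENO)`. This says nothing about 'text' vs 'binary' content, only about the file type.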

> You just restated my argument - the only sane way to handle directories is
> to consistently make them cause an error.

nope, seems you didn't get my point: it's wrong to *abort* the script just
because getline stumbled on an unreadable/unknown file/entity/whatever;
the spec only requires that awk bail out on file *operand* errors:

"CONSEQUENCES OF ERRORS

    If any file operand is specified and the named file cannot be accessed,
    awk shall write a diagnostic message to standard error and terminate 
    without any further action."

but an error on getline should just return -1, *not* exit, like in

$ ls -l /tmp/a
--w-------  ... /tmp/a
$ gawk 'BEGIN{r=getline < "/tmp/a"; print "r="r" -"$0"- ERRNO=" ERRNO}'
r=-1 -- ERRNO=Permission denied
$ mawk 'BEGIN{r=getline < "/tmp/a"; print "r="r" -"$0"- ERRNO=" ERRNO}'
r=-1 -- ERRNO=

does this behaviour sound consistent:

$ awk 'BEGIN{r=getline; print "r="r" -"$0"- ERRNO=" ERRNO}' < /root
bash: /root: Permission denied
$ awk 'BEGIN{r=getline < "/root"; print "r="r" -"$0"- ERRNO=" ERRNO}' 
r=-1 -- ERRNO=
$ awk 'BEGIN{r=getline; print "r="r" -"$0"- ERRNO=" ERRNO}' < /tmp 
awk: read error (Is a directory)
$ awk 'BEGIN{r=getline < "/dev/cdrom"; print "r="r" -"$0"- ERRNO=" ERRNO}'
[and CDD starts spinning, awk slurps in whole CD ... <Ctrl-C>]

Again, I agree that reading a dir properly is a matter for a plug-in or,
better, some '/dir/' extension, though I deem it wrong to either bail out or
carry on as if at EOF, else a script can't be designed to deal with getline
errors.
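
To illustrate the return convention argued for here (1 = record read, 0 = EOF, -1 = error, never exit), a rough C sketch follows. `soft_getline` is a hypothetical name, the function reopens the file on every call and so only ever returns the first line, and it is not how gawk, mawk or nawk actually implement getline.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical wrapper following awk's getline return convention:
 *   1  a line was stored in buf
 *   0  end-of-file, nothing read
 *  -1  error; *err holds the errno value, and the caller decides what to do
 * size must be at least 1. */
int soft_getline(const char *path, char *buf, int size, int *err)
{
    FILE *fp;
    int rc;

    *err = 0;
    buf[0] = '\0';
    fp = fopen(path, "r");
    if (fp == NULL) {           /* eg permission denied, no such file */
        *err = errno;
        return -1;
    }
    errno = 0;
    if (fgets(buf, size, fp) != NULL) {
        rc = 1;
    } else if (ferror(fp)) {    /* read failed: report it, do not treat as EOF */
        *err = errno;
        rc = -1;
    } else {
        rc = 0;                 /* genuine end-of-file */
    }
    fclose(fp);
    return rc;
}
```

The point of the design is that the error path and the EOF path stay distinguishable, so the calling script (here, the calling C code) can test for -1 and carry on.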


-- 
paolo



