Bugs in unexpand(1) version 6.10

From: Kevin O'Gorman
Subject: Bugs in unexpand(1) version 6.10
Date: Thu, 22 Jan 2009 11:20:47 -0800 (PST)

Three oddities in unexpand have been noted by my students here.

Version information:
address@hidden Test $ unexpand --version
unexpand (GNU coreutils) 6.10
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by David MacKenzie.
address@hidden Test $ 

1) Replacement of tabs by spaces.  This is backwards, since unexpand is
supposed to convert spaces to tabs, not the other way around.  The
attached bash script tabtest.sh illustrates the behavior.  When I run
it, I get this output:

address@hidden Test $ bash tabtest.sh
0000000 61 09 62 20 63 0a
          a  \t   b       c  \n
0000000 61 20 62 20 63 0a
          a       b       c  \n
address@hidden Test $
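
The attached tabtest.sh is not reproduced in this archive. The following is only a minimal sketch of a script that produces a before/after dump in the same style. The helper name show_unexpand is hypothetical, and the unexpand options used by the real script are not known, so none are passed here.

```shell
# Hypothetical helper (not from the attachment): hex-dump a string
# before and after it passes through unexpand.
# Usage: show_unexpand 'input with \t and \n escapes' [unexpand options...]
show_unexpand() {
    input=$1
    shift
    # Dump of the input as given.
    printf '%b' "$input" | od -t x1c
    # Dump of what unexpand produces, with any extra options passed through.
    printf '%b' "$input" | unexpand "$@" | od -t x1c
}

# The input bytes match the first dump in the report: a, tab, b, space, c, newline.
show_unexpand 'a\tb c\n'
```

Options can be appended to the call, for example show_unexpand 'a\tb c\n' -a, to compare behavior across settings and versions.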

2) Infinite output when conflicting options are given.  It seems to me
that a suitable diagnostic message would be better.  The test case is the
attached testcase.sh script.  When I run it, I get this (the run-on error
message is from the original):

address@hidden Test $ bash testcase.sh
testcase.sh: line 5: 22245 File size limit exceededunexpand -t2 -t5  > 
testoutput <<EOF

-rw-r--r-- 1 kevin kevin 1024 2009-01-22 10:14 testoutput
0000000 09 09 09 09 09 09 09 09 09 09 09 09 09 09 09 09
         \t  \t  \t  \t  \t  \t  \t  \t  \t  \t  \t  \t  \t  \t  \t  \t
address@hidden Test $
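
The attached testcase.sh is not reproduced in this archive. The error message and the 1024-byte file suggest it capped the output with a file size limit. A minimal sketch along those lines follows; the input line in the here-document and the use of a temporary file in place of testoutput are assumptions, not the attachment's contents.

```shell
# Scratch file standing in for the report's "testoutput".
out=$(mktemp)

# Run in a subshell so the file size limit does not persist afterwards.
# In bash, "ulimit -f 1" caps written files at one 1024-byte block.
(
    ulimit -f 1
    # The input line below is a hypothetical stand-in for the attachment.
    unexpand -t2 -t5 > "$out" <<EOF
a  b    c
EOF
)

# Show the size of the result and dump its first 16 bytes.
ls -l "$out"
head -c 16 "$out" | od -t x1c
```

The limit is what keeps a runaway unexpand from filling the disk: the process is stopped once the file reaches the cap, so the script stays safe to run on an affected version.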

3) Single spaces leading up to a tab stop are treated inconsistently.
Sometimes they are replaced by a tab and sometimes not.  The info page is
vague enough to allow either interpretation, but the variations seem
undesirable.

If there's a good reason for the behavior, it should be documented.

I would note that the POSIX.1 man page is explicit, and allows only for changing
an initial sequence of blanks (not at issue here) or two or more blanks leading
up to a tab (also not at issue) so that a POSIX-compliant implementation would
not do conversions at all in the case of single blanks.  This seems consistent
with the motivation of making the file smaller, and avoiding changes that do
not further that end.

The test case is in the script testspace.sh.
When I run it, I get:

address@hidden Test $ bash testspace.sh
0000000 61 62 63 20 64 65 66 20 20 67 0a
          a   b   c       d   e   f           g  \n
0000000 61 62 63 20 64 65 66 09 20 67 0a
          a   b   c       d   e   f  \t       g  \n
address@hidden Test $

The blank between "c" and "d" is not converted, but
the blank after "f" is converted to a tab (^I).
It is not at all clear why, since they both lead up to a tab stop.
One surmises that the following blank is making a difference, but it's
hard to see a motivation for the distinction.
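
The column arithmetic behind "they both lead up to a tab stop" can be sketched as follows. The tab width of 4 is an assumption: it is one setting under which both blanks end at a tab stop, but the options actually used by testspace.sh are not shown here. The helper name blanks_before_stops is hypothetical, and columns are counted from 0.

```shell
# Hypothetical helper: list each blank in a line and report whether the
# column right after it is a tab stop (a multiple of the tab width).
# Usage: blanks_before_stops 'text' tab_width
blanks_before_stops() {
    line=$1
    width=$2
    col=0
    while [ -n "$line" ]; do
        rest=${line#?}        # the line without its first character
        ch=${line%"$rest"}    # the first character itself
        if [ "$ch" = ' ' ]; then
            if [ $(( (col + 1) % width )) -eq 0 ]; then
                echo "column $col: blank ends at a tab stop"
            else
                echo "column $col: blank does not end at a tab stop"
            fi
        fi
        col=$((col + 1))
        line=$rest
    done
}

# Input line from the report, with an assumed tab width of 4.
blanks_before_stops 'abc def  g' 4
```

Under that assumption the blanks at columns 3 and 7 both end at a tab stop, while the one at column 8 does not, which is why treating the first two differently looks inconsistent.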

I submit that it is just as well not to convert in both cases, as that
is most consistent with POSIX.

In any event, the documentation should be more clear about what cases are
handled and how.

++ kevin

Kevin O'Gorman, PhD   http://users.csc.calpoly.edu/~kogorman

Attachment: testspace.sh
Description: application/shellscript

Attachment: testcase.sh
Description: application/shellscript

Attachment: tabtest.sh
Description: application/shellscript
