bug-coreutils

bug#41518: Bug in od?


From: Bob Proulx
Subject: bug#41518: Bug in od?
Date: Fri, 29 May 2020 16:32:50 -0600

Yuan Cao wrote:
> > https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#The-_0027od-_002dx_0027-command-prints-bytes-in-the-wrong-order_002e
> 
> Thanks for pointing me to this documentation.
> 
> It just feels strange because the order does not reflect the order of the
> characters in the file.

It feels strange in the environment of *today*.  But in the 1970s, when
'od' was written, it was perfectly natural on the PDP-11 to print
out the native machine word in the *native word order* of the PDP-11.
During that time most software operated on the native architecture and
the idea of being portable to other systems was not yet common.

The PDP-11 is a 16-bit word machine.  Therefore what you are seeing
with the 2-byte integer and the order it is printed is the order that
it was printed on the PDP-11 system.  It has remained unchanged to
the present day because it can't change without breaking all
historical use.

For anyone using od today the best replacement for -x is -tx1, which
prints bytes one at a time in file order, the same on every machine.
Whenever you think to type in -x use -tx1 instead.  This avoids
breaking historical use and produces the output that you are wanting.
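
To make the difference concrete, here is a small sketch.  The four
input bytes are just an illustration, and -An (print no address
column) is only there to keep the output short.

```shell
# Four known bytes: 'a' 'b' 'c' 'd' = 0x61 0x62 0x63 0x64
by_word=$(printf 'abcd' | od -An -x)     # 2-byte units, host byte order
by_byte=$(printf 'abcd' | od -An -tx1)   # single bytes, file order

echo "od -x:   $by_word"
echo "od -tx1: $by_byte"
```

On a little-endian machine such as x86-64 the -x line should show
6261 6463, with each pair of bytes swapped.  A big-endian machine
would show 6162 6364 instead.  The -tx1 line shows 61 62 63 64 on
both.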

> I think it might have been useful to get the "by word" value of the file if
> you are working with a binary file historically. One might have stored some
> data as a list of shorts. Then, we can easily view the data using "od -x
> data_file_name".
> 
> Since memory is so cheap now, people are probably just using chars
> for text, and 4 byte ints or 8 byte ints where they used to use 2 byte ints
> (shorts) before. In this case, the "by word" order does not seem to me to
> be as useful and violates the principle of least astonishment needlessly.

But changing the meaning of a command's options is a hard problem and
cannot be done without breaking a lot of existing use.  The better way
is not to try.  The options to head and tail changed an eon ago and yet
just in the last week I ran across a posting where that change
bit someone.

And since there is no need for any breaking change it is better not to
do it.  Simply use the correct options for what you want.  -tx1 in
this case.

> It might be interesting to change the option to print values by double word
> or quadword instead or add another option to let the users choose to print
> by double word or quadword if they want.

And 16 bits was a good size in yesteryear.  32 bits has been a good
size for some years.  Now 64 bits is the most common size.  The only
way to win is not to play.  Better to say the size explicitly.  And
IMNHO the best size is 1 regardless of architecture.

  od -Ax -tx1z -v

Each of those options has been added over the years and each changes
the behavior of the program.  Each of those would be a breaking change
if it were made the default.  Best to ask for what you want explicitly.
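
As a rough sketch of what those options do: -Ax prints the offsets in
hex, -tx1 prints one byte per unit, the trailing z adds the printable
characters at the end of each line, and -v turns off the collapsing
of repeated lines.  The 48 zero bytes from /dev/zero are only a
convenient test input, and head -c is assumed to be available.

```shell
# 48 identical bytes fill three 16-byte output lines.
collapsed=$(head -c 48 /dev/zero | od -Ax -tx1z)     # repeats shown as '*'
full=$(head -c 48 /dev/zero | od -Ax -tx1z -v)       # every line printed

printf '%s\n\n%s\n' "$collapsed" "$full"
```

Without -v the second and third lines are replaced by a single '*'.
With -v all three data lines appear, followed by the final offset.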

I strongly recommend https://www.ietf.org/rfc/ien/ien137.txt as
required reading.

Bob




